

AMERICAN 
JOURNAL OF MATHEMATICS 


FOUNDED BY THE JOHNS HOPKINS UNIVERSITY 


EDITED BY 


R. BRAUER F. D. MURNAGHAN 
UNIVERSITY OF TORONTO THE JOHNS HOPKINS UNIVERSITY 


L. M. GRAVES H. WHITNEY 
UNIVERSITY OF CHICAGO HARVARD UNIVERSITY 
A. WINTNER 
THE JOHNS HOPKINS UNIVERSITY 


WITH THE COOPERATION OF 


R. BAER S. BOCHNER G. BIRKHOFF 
J. DOUGLAS D. MONTGOMERY R. COURANT 
W. HUREWICZ C. L. SIEGEL P. HARTMAN 
N. LEVINSON D. C. SPENCER J. L. VANDERSLICE 
G. PALL M. A. ZORN A. ZYGMUND 


PUBLISHED UNDER THE JOINT AUSPICES OF 
THE JOHNS HOPKINS UNIVERSITY 
AND 
THE AMERICAN MATHEMATICAL SOCIETY 


Volume LXVIII, Number 2 
APRIL, 1946 


THE JOHNS HOPKINS PRESS 
BALTIMORE 18, MARYLAND 
U.S. A. 




CONTENTS 


Linear variations of constants. By AUREL WINTNER.
Fixed point theorems for multi-valued transformations. By SAMUEL EILENBERG and DEANE MONTGOMERY.
The values of the norms in algebraic number fields. By AUREL WINTNER.
The asymptotic number of Latin [...]. By PAUL ERDÖS and IRVING [...].
A matrix differential equation of Riccati [...]. By [...] T. [...].
Reversibility and two-dimensional airfoil theory. By GARRETT BIRKHOFF.
A limit theorem for random variables with infinite moments. By W. [...].
Scale hypersurfaces for conformal-[...] space. By YUNG-CHOW [...].
A factorization of the densities of [...] in algebraic number fields. By AUREL WINTNER.
The fundamental lemma in Dirichlet’s theory of the [...] progressions. By AUREL WINTNER.
Asymptotic integration constants in the [...] of Briot-Bouquet. By AUREL WINTNER.
On the asymptotic behavior of the solutions of a non-linear differential equation. By PHILIP HARTMAN and AUREL WINTNER.
Converse linearity conditions. By R. H. BING.
A density theorem for power series. By R. P. BOAS, JR.
A solution theory of the Möbius inversion. By AUREL WINTNER.
Metrically homogeneous spaces. By HERBERT BUSEMANN.


The AMERICAN JOURNAL OF MATHEMATICS will appear four times yearly. 

The subscription price of the JOURNAL for the current volume is $7.50 (foreign 
postage 50 cents); single numbers $2.00. 

A few complete sets of the JOURNAL remain on sale. 

Papers intended for publication in the JOURNAL may be sent to any of the Editors. 

Editorial communications may be sent to Professor F. D. MURNAGHAN at The Johns 
Hopkins University. 

Subscriptions to the JOURNAL and all business communications should be sent to 
THE JOHNS HOPKINS PRESS, BALTIMORE 18, MARYLAND, U. S. A. 


Entered as second-class matter at the Baltimore, Maryland, Postoffice, acceptance for mailing at special 
rate of postage provided for in Section 1103, Act of October 3, 1917, Authorized on July 8, 1918. 


PRINTED IN THE UNITED STATES OF AMERICA 
BY J. H. FURST COMPANY, BALTIMORE, MARYLAND 






LINEAR VARIATIONS OF CONSTANTS.* 


By AUREL WINTNER. 


1. Let $D = D_z$ be a homogeneous, linear differential operator of order 
n, with coefficients which are single-valued and regular near a point, say the 
origin, of the z-plane. Then every solution w = w(z) of Dw = 0 is a linear 
combination of a finite number of functions each of which is of the form 
$z^{\lambda}(\log z)^{l}L(z)$, where $\lambda$ is one of the “characteristic exponents,” $l$ a 
non-negative integer not exceeding n − 1, and L(z) a Laurent series convergent 
near z = 0. This is a classical result of Fuchs (1866), who also determined 
the conditions under which no L(z) has an essential singularity at z = 0, 
that is, under which the expansion (near z = 0) of every solution w(z) can 
be calculated recursively (rather than only by using infinite determinants or 
equivalent processes). These conditions, subsequently found in Riemann’s 
posthumous notes also, prove to be fundamental, since they do not involve the 
knowledge of any solution: Necessary and sufficient is that, when the coefficient 
of the n-th derivative of w in Dw = 0 is 1, the singularity of the coefficient of 
the k-th derivative be a pole of order n − k (at most), where k = 0, 1, ···. 

The necessity of this criterion, though essential in the hypergeometric 
theory and its generalizations (Riemann; Pochhammer), is quite on the 
surface. In addition, it is of an accidental nature, in the sense that it ceases 
to hold when Dw = 0 is generalized to a system of n linear differential equations 
of the first order, w’ = F(z)w, where w = w(z) now is a vector with n 
components and F(z) a matrix of n times n functions. 

On the other hand, the sufficiency of the criterion can be transferred to 
this, more general and symmetric, case without any additional trouble. In 
fact, the sufficiency of the criterion then states simply that all Laurent series 
occurring in the general solution of w’ = F(z)w are free of essential singularities 
whenever the matrix F(z) has at z = 0 a simple pole (at most). 

All the known proofs of this criterion (as well as of its particular case 
belonging to Dw = 0) are arduous. They are more or less straightforward 
variants of two main types. The first type of proof is substantially that of 
Fuchs. This proof considers, first under the mere assumption that F(z) is 
single-valued and regular near z = 0, the local monodromy group, consisting 
of the substitutions to which the solution vectors w = w(z) are subjected by 
a circuit about z = 0. It then puts the monodromy matrices into their 
(common) Jordan normal form. From this, it is possible to conclude that 
all solutions are linear combinations of a finite number of functions each of 
which is composed of three factors $z^{\lambda}$, $(\log z)^{l}$, $L(z)$. Finally, it is shown 
(and this can be effected in various ways) that no L(z) can have an essential 
singularity if the singularity of the coefficient matrix is a simple pole. 

* Received October 23, 1945. 

In contrast, the second type of proof (which, in the particular case of 
Dw = 0, goes back to Frobenius) confines itself to the case of a simple pole 
from the beginning. It consists in proving, by the method of undetermined 
coefficients, that the expansions of the solutions to be supplied by the answer 
all exist and form n, and not less than n, linearly independent solution vectors. 
And this requires, besides a convergence proof, a counting of constants, which 
becomes elaborate not only when one of the above integers $l$ becomes distinct 
from 0 (that is, when the matrices of the monodromy group have a multiple 
invariant factor) but also when the difference of two of the (normalized) 
characteristic exponents $\lambda$ happens to be a real integer. What makes matters 
worse in this formal theory is that, just as in the theory of the hypergeometric 
equation, the case of (ostensibly) exceptional $\lambda$-sets has no function-theoretical 
significance, since the monodromy group determines just the residue class 
(mod 1) of an exponent $\lambda$, rather than $\lambda$ itself. 

Both of these proofs (the “Riemannian” and the “Weierstrassian”) 
apply the full force of the theory of the elementary divisors; the first, via the 
normal form of the monodromy group, the second, as a counting machine. 
But a glance at the explicit form of the final result shows that what is actually 
proved can finally be worded so as to involve neither the characteristic numbers 
($= e^{2\pi i\lambda}$) nor the elementary divisors of the monodromy group; namely, as 
follows: 


(i) If all n² elements of the coefficient matrix, F = F(z), of w’ = F(z)w 
are regular in a circle |z| < a except for simple poles at z = 0, and if $R = R_F$ 
denotes the residue $(zF(z))_{z=0}$, then one solution matrix is a matrix product 
of the form $P(z)z^{R}$, where P(z) is a matrix regular at z = 0 (that is, 
$P(z) = P(0) + P'(0)z + \cdots$ if |z| < a) and det P(z) ≠ 0 
if 0 < |z| < a. 


It is understood that by a “solution matrix” is meant any matrix the 
columns of which are solution vectors (cf. (12) below), and that det P(z) ≠ 0 
makes the solution matrix in question a “fundamental matrix.” 

The matrix multiplying P(z) is defined, of course, by $z^{R} = e^{R\log z}$, 
where $e^{A}$ denotes the matrix 

$e^{A} = \sum_{m=0}^{\infty} A^{m}/m!$ 

and log z is thought of on its Riemann surface. Since $e^{A}$ has the period $2\pi i E$, 
this agrees with the fact that only $e^{2\pi i\lambda}$, that is, only the residue class (mod 1) 
of a characteristic exponent $\lambda$, is defined by the monodromy group. 
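As an illustration only, the definition of $z^{R}$ through the exponential series can be evaluated numerically. The following is a minimal Python sketch and not part of the original argument: the helper names `mat_mul`, `mat_exp`, `z_pow`, the truncation of the series at 60 terms, and the sample matrix are all assumptions, and the principal branch of log z stands in for the Riemann surface.

```python
import cmath


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=60):
    """Partial sum of the series sum_{m>=0} A^m / m! (the cut-off is an assumption)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term A^m / m!, starting with E
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def z_pow(z, r):
    """z^R = exp(R log z), using the principal branch of the logarithm."""
    log_z = cmath.log(z)
    return mat_exp([[v * log_z for v in row] for row in r])


# Hypothetical sample: a Jordan block R = 0.5 E + N with N^2 = 0.
R = [[0.5, 1.0], [0.0, 0.5]]
W = z_pow(2.0, R)
```

For this sample, $z^{R} = z^{1/2}(E + N\log z)$, so the off-diagonal entry of `W` carries the logarithmic factor that the classical proofs extract from the Jordan normal form.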

If Jordan’s normal form, say J, of A is known, it can be used in order 
to “sum” the infinite series of $e^{A}$. In fact, if $A = TJT^{-1}$, then $e^{A} = Te^{J}T^{-1}$, 
and $e^{J}$ can be “summed” by using, for the powers of J, the recursion formula 
supplied by Cayley’s theorem, f(B) = 0, where f(s) = det (sE − B). But 
the significance of such “summations” is secondary indeed, and so the truly 
function-theoretical wording of the classical result, a wording based on the 
exponential function $z^{R}$ rather than on the counting machine of the elementary 
divisors of R, is just (i). 

An experimentum crucis is supplied by infinite matrices R which are 
bounded in Hilbert’s sense. Then $z^{R}$ is defined as before and is a non-singular, 
bounded matrix (at every z ≠ 0). But now there is no Cayley theorem 
available for “summation” purposes and, what is much worse, no analogue 
of the Jordan-Weierstrass theory can exist (Toeplitz), not even in the 
apparently harmless case of a completely continuous R. 


2. Let the dotted circle, 0 < |z| < a, be replaced by an interval, 
0 < t < a, and let the pole of the coefficient matrix be split off by placing 
$F(t) = t^{-1}R + G(t)$ in (i). Then the “exponential” wording of the classical 
theorem suggests the possibility of an extension to the case in which the 
deviation, G(t), of F(t) from the principal part, $t^{-1}R$, is not a regular power 
series but a function restricted by “real” smoothness assumptions only. 
Neither of the classical types of proof, sketched above, is then available (the 
second not, because there are no coefficients to be compared, and the first not, 
because there are no “circuits,” hence no monodromy matrices, in the 
one-dimensional case). Nevertheless, the extensions in question prove to exist, 
at least under the assumption that the constant matrix R, representing the 
formal residue of F(t), is “small” enough; the limitation of its “size” 
depending only on the dimension number, n (rather than on the choice of 
$G(t) = F(t) - t^{-1}R$ also). 

In order to define this notion, let |A| denote the greatest among the n² 
absolute values $|a_{ik}|$, if $(a_{ik})$ is the matrix A. Then it is clear that 

(1) $|AB| \le n|A||B|$ and $|A + B| \le |A| + |B|$. 

Hence, if $[X] = [X]_R$ denotes the matrix 

(2) $[X] = RX - XR$, 

there exists a number r satisfying the inequality 

(3) $|[X]| \le r|X|$, 

where the matrix X is arbitrary, the matrix R is fixed, and the factor $r = r_R$ 
is independent of X. Let r be the least factor satisfying (3) for every X. 
This unique $r = r_R$ will be called the cross-modulus of the matrix R. 
According to (1) and (2), the cross-modulus is subject to the inequality 

(4) $r \le 2n|R|$, 

no matter what R may be. 
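The cross-modulus can be computed explicitly in small cases. The sketch below is an illustration under stated assumptions and is not taken from the paper: it treats X → [X] as a linear map on the n² entries of X and uses the standard fact that the norm induced by the maximum modulus equals the largest absolute row sum. The function names `bracket`, `max_abs` and `cross_modulus` and the sample matrices are hypothetical.

```python
def bracket(r, x):
    """[X] = RX - XR, as in (2)."""
    n = len(r)
    return [[sum(r[i][j] * x[j][k] - x[i][j] * r[j][k] for j in range(n))
             for k in range(n)] for i in range(n)]


def max_abs(a):
    """|A|: the greatest absolute value among the entries of A."""
    return max(abs(v) for row in a for v in row)


def cross_modulus(r):
    """Least factor in (3), obtained as the largest absolute row sum of X -> [X].

    In the (i, k) entry of [X], the entry x_ik has coefficient r_ii - r_kk,
    x_jk (j != i) has coefficient r_ij, and x_il (l != k) has coefficient -r_lk.
    """
    n = len(r)
    best = 0.0
    for i in range(n):
        for k in range(n):
            row_part = sum(abs(r[i][j]) for j in range(n) if j != i)
            col_part = sum(abs(r[l][k]) for l in range(n) if l != k)
            best = max(best, abs(r[i][i] - r[k][k]) + row_part + col_part)
    return best


# Hypothetical sample: a nilpotent R and a matrix X attaining the bound.
R = [[0.0, 1.0], [0.0, 0.0]]
X = [[-1.0, 0.0], [0.0, 1.0]]
r_value = cross_modulus(R)
attained = max_abs(bracket(R, X)) / max_abs(X)
```

Each row sum contains 2n − 1 terms, one of which is at most 2|R| and the rest at most |R|, which agrees with the bound (4). A scalar matrix R commutes with every X and so has cross-modulus 0.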

By using the appropriate metric, it is possible to extend the definition of 
|A|, and therefore that of the cross-modulus, to the case of Hilbert’s space 
(and other linear spaces). The following proofs then require nothing but the 
customary transcription. 

A “size” of the formal residue being defined by its cross-modulus, the 
theorem announced above can now be formulated as follows: 


(i₀) Let R be a constant matrix the cross-modulus of which is less than 
1, and let G(t) be a matrix function which is continuous on a half-open 
interval, $0 < t \le t_0$, and remains bounded, 

(5) $G(t) = O(1)$, as $t \to +0$. 

Then one solution matrix of 

(6) $x' = F(t)x$, where $F(t) = R/t + G(t)$, 

is a matrix product of the form 

(7) $P(t)t^{R}$, ($t^{R} = e^{R\log t}$), 

where 

(8) $P(+0)$ exists 

(as a finite limit) and 

(9) $\det P(+0) \ne 0$. 

[Actually, (8) can be improved to 

(10) $P(t) = P(+0) + O(t)$, 

as $t \to +0$.] 




If (i₀) did not restrict the residue of the principal part of F, it would 
represent a complete dual of (i) for the non-analytic case. As concerns 
the secondary part of F, the assumption (5) is sure to be satisfied if G(t) 
is uniformly continuous. But (5) does not assume this, that is, G(+0) 
need not exist. 

The proof of (i₀) will depend on a simple “Abelian” lemma (in contrast 
to some “Tauberian” consideration). 


3. Let X, U, A, ··· denote matrices (with n rows and n columns), 
and let α’ denote dα/dt, whether α = X or α = x, where t is a real variable. 
If A(t) is a continuous function on an interval, then 

(11) $x' = A(t)x$ 

has on the whole interval a unique solution x = x(t) which, at a point $t_0$ of 
the interval, becomes an arbitrary initial vector, $x(t_0)$. If X = X(t) denotes 
the matrix the columns of which are n solutions, say $x_1, \cdots, x_n$, 
then, since 

(12) $X = (x_1, \cdots, x_n)$, 

(11) is equivalent to 

(13) $X' = A(t)X$, 

and the determination of the general solution of (11) is equivalent to the 
determination of a solution of (13) in which the columns are linearly 
independent, that is, 

(14) $\det X(t) \ne 0$. 


Since, if tr A denotes the sum of the diagonal elements of A, 

(15) $\det X(t) = \det X(t_0)\,\exp \int_{t_0}^{t} \operatorname{tr} A(s)\,ds$ 

is an identity in t by virtue of (13) alone (Jacobi), it is needless to specify 
whether the linear independence of the columns of (12) be meant for some t 
or for every t. Clearly, (13) remains true if X is replaced by XC (but not 
by CX), where C is any constant matrix. Hence, if a solution of (13) satisfying 
(14) is called a fundamental matrix of (11), a matrix is a fundamental 
matrix of (11) if and only if it is a product X(t)C, where X(t) is some 
fundamental matrix and C denotes a constant matrix of non-vanishing 
determinant. 
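Jacobi's identity (15) lends itself to a quick numerical check. The sketch below is illustrative only: the classical Runge-Kutta integrator `solve_matrix_ode`, the step count, and the sample coefficient matrix `sample_coeff` are assumptions introduced for the example, not anything used in the paper.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def axpy(x, k, h):
    """Return x + h * k entrywise."""
    n = len(x)
    return [[x[i][j] + h * k[i][j] for j in range(n)] for i in range(n)]


def solve_matrix_ode(coeff, x0, t0, t1, steps=400):
    """Integrate X' = A(t) X from t0 to t1 with the classical RK4 scheme."""
    h = (t1 - t0) / steps
    x, t = x0, t0
    n = len(x0)
    for _ in range(steps):
        k1 = mat_mul(coeff(t), x)
        k2 = mat_mul(coeff(t + h / 2), axpy(x, k1, h / 2))
        k3 = mat_mul(coeff(t + h / 2), axpy(x, k2, h / 2))
        k4 = mat_mul(coeff(t + h), axpy(x, k3, h))
        incr = [[k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j]
                 for j in range(n)] for i in range(n)]
        x = axpy(x, incr, h / 6)
        t += h
    return x


def det2(x):
    """Determinant of a 2 x 2 matrix."""
    return x[0][0] * x[1][1] - x[0][1] * x[1][0]


def sample_coeff(t):
    """Hypothetical coefficient matrix with tr A(t) = -t."""
    return [[0.0, 1.0], [-1.0, -t]]


X1 = solve_matrix_ode(sample_coeff, [[1.0, 0.0], [0.0, 1.0]], 0.0, 1.0)
lhs = det2(X1)
rhs = math.exp(-0.5)  # det X(0) * exp(integral of tr A(s) over [0, 1])
```

Here `lhs` and `rhs` should agree up to the discretization error of the integrator, whichever fundamental matrix is chosen at the start.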




Instead of considering, as before, just one system of homogeneous linear 
differential equations, consider two of them, (11) and 

(16) $y' = B(t)y$ 

or, equivalently, (13) and 

(17) $Y' = B(t)Y$, 

where A and B are arbitrary continuous functions on a (common) t-interval. 
If (11) is to be transformed into (16) by the procedure of the variation of 
constants (Lagrange), then, since X(t) had to be placed in front of the constant 
matrix C, the “varied” form, say U = U(t), of C = const. must be set 
up as follows: 

(18) $Y = XU$. 

And this leads to the 

Rule for the Variation of Constants from the Right. If X(t) is a 
fixed fundamental matrix of (11), all the fundamental matrices Y(t) of (16), 
and only these matrices, are represented by the product (18) in which U(t) 
denotes an arbitrary fundamental matrix of u’ = K(t)u, that is, any solution 
U = U(t) of 

(19) $U' = K(t)U$ 

satisfying 

(19 bis) $\det U(t) \ne 0$, 

where the coefficient matrix, K = K(t), is the ordinary transform of the 
perturbation B(t) − A(t), that is, 

(20) $K = X^{-1}(B - A)X$. 


This follows by direct substitutions. In fact, if U is thought of as defined 
by (18) (which is possible, since det X ≠ 0), differentiation of (18) gives 

$Y' = X'U + XU'$. 

Hence, if Y’, Y, X’ are substituted from (17), (18), (13) respectively, 

$BXU = AXU + XU'$. 

But this can be written in the form (19), if K = K(t) is defined by (20). 
If (18) is replaced by 

(21) $Y = VX$, 

what results is the 

Rule for the Variation of Constants from the Left. If X(t) is a 
fundamental matrix of (11), all fundamental matrices Y(t) of (16), and 
only these matrices, are represented by the product (21) in which V = V(t) 
denotes any solution of 

(22) $V' = B(t)V - VA(t)$ 

satisfying 

(22 bis) $\det V(t) \ne 0$. 
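The direct substitution behind the second rule can be written out as follows. This is a sketch of the verification, assuming only (13), (17), (21) and the standard identity $(X^{-1})' = -X^{-1}X'X^{-1}$ for the derivative of an inverse.

```latex
\begin{aligned}
V  &= Y X^{-1}, \\
V' &= Y' X^{-1} + Y\,(X^{-1})' \\
   &= B Y X^{-1} - Y X^{-1} X' X^{-1} \\
   &= B V - Y X^{-1} (A X) X^{-1} \\
   &= B V - V A , \\
\det V &= \det Y / \det X \neq 0 .
\end{aligned}
```

Read backwards, the same lines show that any solution V of (22) with non-vanishing determinant makes Y = VX a fundamental matrix of (16).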


The verification proceeds in the same way as before. But the two rules 
are quite different in structure. First, (19) is equivalent to u’ = K(t)u, 
a homogeneous, linear system of order n, whereas the “unsymmetric bracket” 
condition for the elements of the matrix V represents a homogeneous, linear 
system of order n². Next, the initial fundamental matrix X(t) occurs, via 
(20), in (19) but does not enter into (22). Finally, the second rule is not a 
true process of variation of constants, since it is based on (21), where, if V 
is replaced by a constant matrix, C, the product CX does not become (in 
general) a solution of (13). 

To what the second rule actually corresponds is a local formulation of the 
Riemann-Fuchs equivalence problem of species (= Art = Poincaré’s espèce). 
If this is compared with Schlesinger’s theory of simply canonical systems,* 
then, since the prototype of the problem of (i) is Cauchy’s elementary system 
zw’ = Rw (a system possessing the solution matrix $z^{R}$), it is clear that 
anything pertaining to (i) should be based on (21), (22), rather than on 
(18), (19), (20), the method of the variation of constants in its strict sense. 


4. Since the substitution 

(23) $t \to e^{-t}$, ($\lim_{t \to +0} \to \lim_{t \to \infty}$), 

transforms the prototype, $tx' = Rx$, of (6) into $x' = -Rx$, where R = Const., 
it is convenient to transform (6) itself by (23). Then (6) becomes 

$-e^{t}x' = F(e^{-t})x$, where $F(e^{-t}) = e^{t}R + G(e^{-t})$, 

that is, 

$x' = A(t)x$, where $A(t) = -R - e^{-t}G(e^{-t})$. 

Since (11) is equivalent to (13) and (12), this can be written in the form 

(24) $X' = -RX + H(t)X$, 

where 

(25) $H(t) = -e^{-t}G(e^{-t})$, ($t \to \infty$). 

According to (25), the assumption (5) is equivalent to 

(26) $H(t) = O(e^{-t})$ as $t \to \infty$. 

* L. Schlesinger, “Ueber eine Klasse von Differentialsystemen beliebiger Ordnung 
mit festen kritischen Punkten,” Crelle’s Journal für die reine und angewandte 
Mathematik, vol. 141 (1912), pp. 96-145, where further references will be found. 

In particular, the prototype of (24) is $X' = -RX$, where −R = Const. 
Since $X = e^{-tR}$ is a solution of this prototype, and since the second of the 
rules of the variation of constants should be applied, what suggests itself for 
(24) is the substitution 

(27) $X = Ze^{-tR}$ 

(and not $X = e^{-tR}Z$, which would correspond to the first rule). But, since 

$(Ze^{-tR})' = (Z' - ZR)e^{-tR}$, 

the substitution (27) transforms (24) into 

$Z' - ZR = -RZ + H(t)Z$; 

and here the principal term is the Poissonian bracket, (2), of Z: 

(28) $Z' = -[Z] + H(t)Z$, where $[Z] = RZ - ZR$. 

The formal properties of the bracket will now make possible a very easy 
proof for a general “Abelian” limit theorem. 

5. The lemma in question is as follows: 

(ii₀) If the cross-modulus of the constant matrix R is less than 1, and 
if the matrix function H(t) is continuous on a half-line, $t_0 \le t < \infty$, and 
satisfies (26), then (28) has a (unique) solution Z = Z(t) satisfying 

(29) $Z(t) \to E$ as $t \to \infty$. 

[Actually, (29) can be improved to 

(29 bis) $Z(t) = E + O(e^{-t})$, 

which, in view of (28) and (26), implies that $Z'(t) = O(e^{-t})$.] 


Here E denotes the unit matrix. In reality, as concerns the rôle of 
(ii₀), an arbitrary C = Const. of non-vanishing determinant, rather than just 
C = E, is needed in the boundary condition (29). But the normalization 
C = E is essential in the following proof of (ii₀) itself, since the 
non-commutative nature of the operations connected with (2) prevents the use of 
an arbitrary C. 

The standard form of the successive approximations to (28) and (29) is 

(30) $Z_{m+1}(t) = E + \int_t^{\infty} [Z_m(s)]\,ds - \int_t^{\infty} H(s)Z_m(s)\,ds$, 

where 

(31) $Z_0(t) = E$. 

The convergence of the integrals (30) will have to be ascertained, of course. 

The point in choosing precisely E in (29) is that X = E is the only 
matrix (≠ 0) for which (2) vanishes when R is unspecified. Thus, from (31) 
and (30), 

$Z_1(t) = E - \int_t^{\infty} H(s)\,ds$. 

This means that 

(32) $Y_1(t) = -\int_t^{\infty} H(s)\,ds$, 

if $Y_{m+1}(t)$ denotes the difference 

(33) $Y_{m+1}(t) = Z_{m+1}(t) - Z_m(t)$. 

But it is clear from (33) that (30) is equivalent to 

(34) $Y_{m+1}(t) = \int_t^{\infty} [Y_m(s)]\,ds - \int_t^{\infty} H(s)Y_m(s)\,ds$, 

since [A − B] = [A] − [B], by (2). Furthermore, (32) and (26) assure 
the existence of $Y_1(t)$. Consequently, the integrals (30) will be proved to 
be convergent if the same is proved of the integrals (34). 

Let r denote the cross-modulus of R. Then, according to (3), (1) 
and (34), 

$|Y_{m+1}(t)| \le r\int_t^{\infty} |Y_m(s)|\,ds + 2n\int_t^{\infty} |H(s)|\,|Y_m(s)|\,ds$. 

Hence, 

$B_{m+1}(t) \le r\int_t^{\infty} B_m(s)\,ds + 2nB_m(t)\int_t^{\infty} |H(s)|\,ds$, 

where 

(35) $B_m(t) = \mathrm{l.u.b.}_{t \le s < \infty}\, |Y_m(s)|$. 

For the present, $B_m(t) = \infty$ is not precluded. But, by the assumption (26), 

(36) $|H(t)| < pe^{-t}$, 

where p is a positive constant. Hence, by the inequality preceding (35), 

(37ₘ) $B_{m+1}(t) < r\int_t^{\infty} B_m(s)\,ds + 2npe^{-t}B_m(t)$, 

where, according to (32) and (35), 

(37₀) $B_1(t) \le \int_t^{\infty} |H(s)|\,ds < pe^{-t}$. 

Thus, from (37₁), 

$B_2(t) < rpe^{-t} + 2np^{2}e^{-2t}$. 

Hence, if $t_\epsilon$ is so large that 

(38) $2npe^{-t} < \epsilon$ when $t_\epsilon < t < \infty$, 

where ε > 0 is arbitrary, 

$B_2(t) < p(r + \epsilon)e^{-t}$. 

Consequently, from (37₂), 

$B_3(t) < p(r + \epsilon)(re^{-t} + 2npe^{-2t})$ 

and so, if t satisfies (38), 

$B_3(t) < p(r + \epsilon)^{2}e^{-t}$. 

It is now clear that (37ₘ) and (38) lead to 

$B_{m+1}(t) < p(r + \epsilon)^{m}e^{-t}$, 

where $t_\epsilon < t < \infty$. 
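The induction step behind the last estimate can be written out explicitly. The following is a sketch, assuming the bound $B_m(s) < p\,\theta^{m-1}e^{-s}$ for $t_\epsilon < s < \infty$ as induction hypothesis, with the abbreviation $\theta = r + \epsilon$.

```latex
\begin{aligned}
B_{m+1}(t) &< r\int_t^{\infty} B_m(s)\,ds + 2np\,e^{-t}B_m(t) \\
           &< r\int_t^{\infty} p\,\theta^{m-1}e^{-s}\,ds
              + 2np\,e^{-t}\,p\,\theta^{m-1}e^{-t} \\
           &= p\,\theta^{m-1}e^{-t}\,\bigl(r + 2np\,e^{-t}\bigr) \\
           &< p\,\theta^{m-1}e^{-t}\,(r + \epsilon) \;=\; p\,\theta^{m}e^{-t},
\qquad t_\epsilon < t < \infty .
\end{aligned}
```

The first line is the recursive inequality for $B_{m+1}$, the last uses (38); the case m = 1 of the hypothesis is the estimate $B_1(t) < pe^{-t}$.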


Accordingly, from (35) and (33), 

(39) $|Z_m(t) - Z_{m+1}(t)| < p\theta^{m}e^{-t}$ if $t^{0} < t < \infty$, 

if θ denotes the sum r + ε and $t^{0}$ is the $t_\epsilon$ belonging to a fixed ε. 

Let ε be chosen so as to make θ = r + ε less than 1 (this is possible, since r, 
the cross-modulus of R, is supposed to be less than 1). Then (39) and (31) 
show that the limit process 

$Z_m(t) \to Z(t)$, $m \to \infty$, 

defines, by uniform convergence, a (continuous) function Z(t) which satisfies 
the relation (29 bis) and, in view of (36) and (2), the limiting case of the 
recursion formula (30), that is, the identity 

$Z(t) = E + \int_t^{\infty} [Z(s)]\,ds - \int_t^{\infty} H(s)Z(s)\,ds$. 

Finally, since Z(t) and H(t) are continuous, this identity implies that Z(t) 
has a (continuous) derivative and is a solution of (28). 

This proves (ii₀), except for the parenthetical assertion claiming the 
uniqueness of Z(t). But this follows by applying the successive approximations 
to the difference of two (allegedly distinct) solutions in the usual manner. 

In order to deduce (i₀) from (ii₀), let X be defined by (12). Then the 
substitutions (23), (25), (27) transform (6) and (5) into (28) and (26) 
respectively. Hence, if the assumptions of (i₀) are satisfied, it follows from 
(ii₀) that (6) has a fundamental matrix, X(t), which is of the form (7), 
where, in view of (23) and (29 bis), 

$P(t) = E + O(t)$ as $t \to +0$. 

Accordingly, (10) and (9) are satisfied, the limit (8) being the unit matrix. 


6. If R in (6), (7) is the zero matrix, then (8), (9) remain true if (5) 
is relaxed to 

$\int_{+0} |G(t)|\,dt < \infty$. 

This known Abelian fact can be formulated as follows: 

(iii₀) If A(t) is continuous on a half-open interval $0 < t \le t_0$ and 
behaves, as t → +0, so as to become absolutely integrable, and if X(t) denotes 
a fundamental matrix of x’ = A(t)x, then X(+0) exists (as a finite limit). 


The proof is the same as, though of course simpler than, that of (ii₀). 

In fact, if the interval $0 < t \le t_0$ is short enough, then, by the assumptions 
of (iii₀), 

(40) $n\int_{+0}^{t_0} |A(t)|\,dt \le \tfrac{1}{2}$ 

(n denotes the dimension number). In view of the principle of superposition, 
it is sufficient to prove (iii₀) for the fundamental matrix, X(t), which is 
assigned by the initial condition $X(t_0) = E$. Then, according to the formulation 
(13) of (11), the successive approximations are defined by 

(41) $X_{m+1}(t) = E - \int_t^{t_0} A(s)X_m(s)\,ds$, 

where $X_0 = E$; hence $X_1(t) = E - \int_t^{t_0} A(s)\,ds$. But, if 

(42) $\alpha_m(t) = \mathrm{l.u.b.}_{t \le s \le t_0}\, |X_m(s) - X_{m-1}(s)|$ 

(so that, for the present, $\alpha_m(t) = \infty$ is allowed), it is clear from (41) and 
(1) that 

$\alpha_{m+1}(t) \le \int_t^{t_0} |A(s)|\,n\,\alpha_m(s)\,ds \le n\,\alpha_m(t)\int_t^{t_0} |A(s)|\,ds$. 

It follows, therefore, from (40) that $\alpha_{m+1}(t) \le \tfrac{1}{2}\alpha_m(t)$; hence 
$\alpha_{m+1}(t) \le (\tfrac{1}{2})^{m}\alpha_1(t)$. Finally, the difference of the matrices $X_0(t)$, $X_1(t)$, which 
were given after (41), is majorized by the integral (40), and so $\alpha_1(t) \le \tfrac{1}{2}$, 
by the case m = 1 of the definition (42). Consequently, $\alpha_m(t) \le (\tfrac{1}{2})^{m}$. In 
view of (42), this implies that the convergence of the successive approximations 
is uniform (on the half-open interval $0 < t \le t_0$). Since this, in turn, 
implies that the limit function is uniformly continuous (the functions (41) 
being continuous), the proof of (iii₀) is complete. 

(iii₀ bis) Under the assumptions of (iii₀), 

(43) $\det X(+0) \ne 0$ 

(if X(t) is a fundamental matrix). 

This is a corollary of (iii₀). In fact, if (43) were false, it would follow 
from (15) and (14), where $0 < t \le t_0$, that the integral on the right of (15) 
is not O(1) as t → +0. But this is excluded by the (absolute) integrability 
of A(t), which is assumed in (iii₀). 
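A scalar example may make the successive approximations in the proof of (iii₀) concrete. The sketch below is illustrative only and rests on assumptions not in the text: the coefficient $A(t) = t^{-1/2}$ (unbounded but integrable at t = +0), the endpoint `T0 = 1/16` chosen so that the integral in (40) equals 1/2, and the helper names `integral_A` and `picard`. In the scalar case the iterates (41) reduce to partial sums of an exponential series.

```python
import math

# Hypothetical endpoint: the integral of |A| over (0, T0] is 2 * sqrt(T0) = 1/2, cf. (40).
T0 = 1.0 / 16.0


def integral_A(t):
    """Integral of A(s) = s ** -0.5 over [t, T0]."""
    return 2.0 * (math.sqrt(T0) - math.sqrt(t))


def picard(t, m):
    """m-th approximation X_m(t) of (41) in the scalar case with X_0 = 1.

    Since the scalar factors commute, X_m(t) is the m-th partial sum of
    exp(-I(t)), where I(t) is the integral of A over [t, T0].
    """
    i = integral_A(t)
    return sum((-i) ** k / math.factorial(k) for k in range(m + 1))


# Exact limit X(+0) of the solution normalized by X(T0) = 1.
limit_value = math.exp(-2.0 * math.sqrt(T0))

# Differences of consecutive approximations near t = +0, cf. the definition (42).
t_small = 1e-12
steps = [abs(picard(t_small, m) - picard(t_small, m - 1)) for m in range(1, 8)]
```

The entries of `steps` stay below the geometric majorant $(\tfrac{1}{2})^{m}$ obtained in the proof, and the limit `limit_value` is finite and different from zero, in line with (iii₀) and (iii₀ bis).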
Another corollary of (iii₀) is the following theorem: 

(iiiₘ) If A(t) has on a half-line m (≥ 0) continuous derivatives 
satisfying 

(44) $\int^{\infty} t^{m+k}|A^{(k)}(t)|\,dt < \infty$, (k = 0, 1, ···, m), 

then every solution vector x = x(t) of x’ = A(t)x possesses an asymptotic 
representation of the form 

(45) $x(t) = \sum_{k=0}^{m} c_k t^{-k} + o(t^{-m})$ as $t \to \infty$. 

Moreover, if $c_0$ is any constant vector, there exists one and only one solution 
x(t) satisfying (45), where 

$c_1 = c_1(c_0)$, $c_2 = c_2(c_0)$, ···, $c_m = c_m(c_0)$. 

First, the substitution 

(46) $t \to 1/t$, ($\lim_{t \to \infty} \to \lim_{t \to +0}$), 

transforms x’ = A(t)x into x’ = *A(t)x, where, if *A(t) is denoted just by 
A(t), 

$\int^{\infty} |A(t)|\,dt = \int_{+0} |A(t)|\,dt$ 

holds in virtue of (46). Hence, in virtue of (46), the case m = 0 of (44) 
goes over into 

$\int_{+0} |A(t)|\,dt < \infty$. 

But the existence of x(+0) means the existence of some c satisfying x(t) 
= c + o(1) as t → +0. Consequently, the case m = 0 of (iiiₘ) is equivalent 
to (iii₀) and (iii₀ bis) together. 

Next, let A(t) have a first derivative on an interval $0 < t \le t_0$. Then 

$x'' = A'(t)x + A(t)x'$, since $x' = A(t)x$. 

But what is in this formula line can simply be written as x’ = A(t)x, if 

n, x, A(t) 

are replaced by 

2n, $\begin{pmatrix} x \\ x' \end{pmatrix}$, $\begin{pmatrix} A(t) & 0 \\ A'(t) & A(t) \end{pmatrix}$ 

respectively, where n is the dimension number and 0 denotes the n-rowed zero 
matrix. In particular, the new A(t) satisfies the conditions of (iii₀) if and 
only if the old A(t) has a continuous first derivative and 

$\int_{+0} |A(t)|\,dt < \infty$, $\int_{+0} |A'(t)|\,dt < \infty$. 

[Actually, the latter two conditions are tautological, since the convergence of 
the second integral implies that of the first and, as a matter of fact, the 
existence of a finite limit for A(t) as t → +0.] 

Hence, if the conditions just mentioned are satisfied, it follows from 
(iii₀) that the old x(t) and its derivative have finite limits, x(+0) and 
x’(+0). Consequently, if x(t), where $0 < t \le t_0$, is defined to be x(+0) 
at t = 0, then, by a standard theorem of differential calculus, x(t) will have 
a continuous first derivative on the closed interval $0 \le t \le t_0$. Accordingly, 
an application of Taylor’s rule gives $x(t) = x(0) + x'(0)t + o(t)$ as t → +0. 

Finally, it is clear from the remarks following (46) that, in virtue of the 
transformation (46), the case m = 1 of the m + 1 conditions (44) is equivalent 
to the pair of conditions represented by the last formula line. This proves 
(iiiₘ) for the case m = 1. And the case of an arbitrary m contains nothing 
new, since the above processes of differentiation can of course be repeated. 


Remark. It is clear from this reduction of (iiiₘ) to (iii₀) that (ii₀) 
implies a theorem, say (iiₘ), which corresponds to (iiiₘ) in the same way as 
(ii₀) to (iii₀). However, the condition r < 1, the first assumption of (ii₀), 
must then be replaced by r < 1/(m + 1). In fact, if m = 1, then what 
correspond to the dimension number n and to the matrices 

R, Z, H, 

of (ii₀) become, in (ii₁), the dimension number 2n and the matrices 


0 RZ—ZR 0 
ir’ 0 RZ—ZR}’ H’ H 


respectively. Hence, if r denotes the cross-modulus of the old R, all that is 
clear is that the cross-modulus of the new R cannot exceed 2r. Thus, the 
limitation which, either just by the proof or by the true nature of (iiₘ), is imposed 
on the cross-modulus of R becomes with increasing m so severe that no R 
distinct from the R = 0 of (iii₀) remains admitted in (ii_∞). 


7. The proof of (i₀) depended on an application of the second rule for 
the linear variation of constants. In what follows, the first rule will be used 
in order to extend (iii₀) in a direction assigned by the applications. To this 
end, it will be convenient to list a few formal facts. 


(I) If X = X(t) is differentiable and of non-vanishing determinant 
at a given t, then $X^{-1}$ is differentiable, and has the derivative $-X^{-1}X'X^{-1}$, 
at that t. 

In fact, since the differentiability of X means the differentiability of all 
elements of X, it implies the differentiability of the polynomials representing 
det X and its minors. This proves the first assertion of (I); whence the second 
can be concluded by differentiating the product $XX^{-1}$, which is E = Const. 


(II) If A(t) is continuous on a t-interval, and if X(t) is a fundamental 
matrix of x’ = A(t)x, then the inverse of X*(t) (which, by (14), 
exists) is a fundamental matrix of x’ = −A*(t)x. 

The latter is called the adjoint of x’ = A(t)x, the asterisk being the 
symbol of Hermitian transposition. In particular, x’ = A(t)x is self-adjoint 
when iA is Hermitian, that is, when A + A* = 0 (for every t). 

The assertion of (II) is that the derivative of $(X^{*})^{-1}$ is $-A^{*}(X^{*})^{-1}$, 
if X’ = AX (and det X ≠ 0). But this follows from the rules 

$(X^{-1})' = -X^{-1}X'X^{-1}$, $(X^{*})' = (X')^{*}$, 

since (AX)* = X*A*. 

If A(t) is continuous (on an open or infinite t-interval), and if x’ = A(t)x 
is self-adjoint, then every solution vector x(t) is a bounded function of t (even 
if A(t) is not). 


This remark, no matter how trivial, contains about all that can be assured 
concerning “stability” in general (except when x’ = A(t)x can be integrated 
by known functions, for instance). The sufficiency of A(t) = −A*(t) follows 
from the fact that |x(t)|² then is a first integral, since the (Hermitian) 
scalar product of two solutions is independent of t. This is seen by differentiation. 
But the true source is the general fact expressed by (II). The latter 
implies that, in the self-adjoint case, $(X^{*})^{-1}$ is a fundamental matrix of 
x’ = A(t)x, if X is. Hence, by the principle of superposition, $X = (X^{*})^{-1}C$, 
where C is independent of t. In particular, X = X(t) is unitary (for every 
t) if, without loss of generality, C = E. 
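The boundedness remark can be illustrated on a real 2 × 2 example in which A(t) is unbounded. The sketch below is an assumption-laden illustration, not part of the paper: the coefficient `coefficient(t)`, the closed-form fundamental matrix `fundamental(t)` (a rotation through the angle t²/2), and the finite-difference check in `residual` are all chosen for the example.

```python
import math


def coefficient(t):
    """Hypothetical skew-symmetric coefficient A(t) = t * [[0, 1], [-1, 0]]; A + A* = 0."""
    return [[0.0, t], [-t, 0.0]]


def fundamental(t):
    """Closed-form fundamental matrix with X(0) = E: rotation through the angle t^2 / 2."""
    c, s = math.cos(t * t / 2), math.sin(t * t / 2)
    return [[c, s], [-s, c]]


def residual(t, h=1e-6):
    """Largest entry of X'(t) - A(t) X(t), with X' taken by a central difference."""
    xp, xm = fundamental(t + h), fundamental(t - h)
    x, a = fundamental(t), coefficient(t)
    worst = 0.0
    for i in range(2):
        for j in range(2):
            deriv = (xp[i][j] - xm[i][j]) / (2 * h)
            ax = sum(a[i][k] * x[k][j] for k in range(2))
            worst = max(worst, abs(deriv - ax))
    return worst


def gram_defect(t):
    """Largest entry of X^T X - E; it vanishes exactly when X(t) is orthogonal."""
    x = fundamental(t)
    return max(abs(sum(x[k][i] * x[k][j] for k in range(2)) - (i == j))
               for i in range(2) for j in range(2))


checks = [(residual(t), gram_defect(t)) for t in (0.0, 3.0, 10.0)]
```

Although |A(t)| grows without bound, X(t) remains orthogonal (the real form of unitary), so every solution vector keeps its length.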

If A = −A*, then every diagonal element of A, and therefore tr A, is 
purely imaginary. This and the remark italicized above imply the parenthetical 
criterion asserted after (47) in the following theorem: 


(iv₀) Suppose that every solution x(t) of x’ = A(t)x, where A(t) is 
continuous on a half-line, is bounded as t → ∞, and that 

(47) $\liminf_{t \to \infty} \Re \int^{t} \operatorname{tr} A(s)\,ds > -\infty$ 

(both of these assumptions are satisfied in the self-adjoint case, (47) being 
implied by 

(47 bis) $\Re \operatorname{tr} A(t) \ge 0$ 

as t → ∞). Let B(t) be any continuous coefficient matrix which is so “close” 
to A(t) that 

(48) $\int^{\infty} |B(t) - A(t)|\,dt < \infty$. 

Then, corresponding to every solution x(t) of x’ = A(t)x, there exists a solution 
y(t) of y’ = B(t)y satisfying 

$y(t) - x(t) \to 0$ as $t \to \infty$. 

Moreover, this y(t) is uniquely determined by x(t), every solution of 
y’ = B(t)y corresponds to a solution of x’ = A(t)x, and the correspondence 
is continuous in terms of the respective initial data $x(t_0)$, $y(t_0)$. 


Clearly, the existence of such a correspondence will be proved if it is 
shown that, if X(t), Y(t) are fundamental matrices of 2’ = A(t)x, f = B(t)y 
respectively, the matrix U(t) = Uxy(t) defined by (18) tends to a limit, 
U(co), of non-vanishing determinant. In fact, the last formula line and 
what follows it are then implied by the principle of superposition: A matrix 
is a fundamental matrix of 7 = @(t)z if and only if it is of the form Z7(1)C, 
where Z(t) is an arbitrarily fixed fundamental matrix and C' a unique (but 
unrestricted) constant matrix of non-vanishing determinant. 

If A, X, +0 in (iii)) are replaced by K, U, t—> respectively (cf. 
the mapping (46) and the remarks following this mapping), it is seen from 
(iiig) and (iiip bis) that the existence of a limit U(«) of non-vanishing 


determinant will be proved if it is shown that 




LINEAR VARIATIONS OF CONSTANTS. 201 


  $\int^{\infty} |K(t)|\,dt < \infty$

is satisfied by the coefficient matrix of $u' = K(t)u$, where $U = (u_1, \dots, u_n)$.
But (13), (17) and the definition, (18), of $U$ imply (19) and (20). Hence,
it is sufficient to show that what is required by the last formula line is
satisfied when $K$ is the matrix (20).

In view of (48), this will be proved if it is ascertained that both $X(t)$
and $X^{-1}(t)$ are $O(1)$ as $t \to \infty$. But $X(t) = O(1)$ is just the assumption
made before (47). This assumption also implies that every minor of $\det X(t)$
is $O(1)$. Hence, $X^{-1}(t) = O(1)$ can fail to hold only if the denominator,
represented by $\det X(t)$, comes arbitrarily close to 0 as $t \to \infty$. And (15)
shows that (47) excludes this possibility.

It is clear from this proof of (iv$_0$) that (iv$_0$) can be refined to a theorem,
say (iv$_m$), which relates to (iv$_0$) in the same way as (iii$_m$) to (iii$_0$). The
situation is made clear enough by the remark that the particular case $A = 0$
of (iv$_0$) is equivalent to (iii$_0$).


8. None of the above results contains the classical theorem, mentioned 
in the introduction. But if the theorem is observed to be equivalent to its 
wording (i), in which nothing refers to the theory of elementary divisors, 
there arises the question whether the classical theorem is or is not so general 
as to be independent of the existence of a Jordan-Weierstrass theory for the
underlying space. It turns out that the answer is in the affirmative. 

Needless to say, the resulting possibility of generalizing (i) in an abstract 
direction must be interpreted as just a symptom of the fact that neither of 
the classical types of proof, sketched in the introduction, takes into account 
the true formal-algebraic foundations of the problem (which, in reality, rest 
on the properties of the bracket operator). In other words, it will suffice to 
give a direct proof for (i) itself, since the possibility of generalizations will 
then become trivial. Such a proof of (i) will be seen to be contained in the 
following fact: 


Liouvillian Lemma. In case of n-rowed matrices, the (linear) equation
(49) $X - [X] = C$, where $[X] = RX - XR$,
has a (unique) solution $X = X^{R}(C)$ satisfying
(50) $|X| \le |C| / (1 - 2n|R|)$ if $2n|R| < 1$.

Here $R$ is fixed (in accordance with the hypothesis of (50)), $C$ is arbitrary,




and the sign of absolute value is that defined before (1). A straightforward 
proof results by using, in the fashion of the theory of integral equations and 
infinite bounded matrices (Liouville, Neumann, Schwarz, Hilb), a “resolvent” 
series, as follows: 

Let a scalar parameter, $s$, be inserted in front of the bracket. Then (49)
becomes the case $s = 1$ of

  $X - s[X] = C$, where $[X] = RX - XR$.

If one tries to solve this equation by a power series

  $X = \sum_{m=0}^{\infty} C_m s^m$,

where the (unknown) matrices $C_m$ are functions of $R$ and $C$, the comparison
of equal powers of $s$ clearly leads to the recursion formula $C_m - [C_{m-1}] = 0$,
where $m = 1, 2, \dots$ and $C_0 = C$. Hence, $C_m = [C]_m$, if the subscript of
the bracket denotes $m$-fold application of (2) to $X = C$ (when $R$ is fixed).
Accordingly, if $\binom{m}{k}$ denotes the binomial coefficient,

  $C_m = \sum_{k=0}^{m} (-1)^{m-k} \binom{m}{k} R^k C R^{m-k}$.

Hence, from (1),

  $|C_m| \le n^m \sum_{k=0}^{m} \binom{m}{k} |R|^k\,|C|\,|R|^{m-k}$,

which means that

  $|C_m| \le n^m\,|C|\,|R|^m \sum_{k=0}^{m} \binom{m}{k} = (2n|R|)^m\,|C|$.

In view of this inequality, the power series which has been tried for $X$
is majorized by

  $|C| \sum_{m=0}^{\infty} |2nsR|^m$

and will, therefore, supply a solution, $X$, if this geometric progression is
convergent. In the case of (49), where $s = 1$, this requires the inequality
$2n|R| < 1$ which, when satisfied, leads to

  $|X| \le |C| \sum_{m=0}^{\infty} (2n|R|)^m = |C| / (1 - 2n|R|)$,

as claimed in (50).
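
The resolvent series of the lemma can be tried numerically. The following is only a sketch under stated assumptions: the absolute value $|A|$ "defined before (1)" is assumed here to be the largest absolute value of an element of $A$ (for which $|AB| \le n|A||B|$ holds), and the matrices `R`, `C` and the helper names are hypothetical, chosen only so that $2n|R| < 1$.

```python
# Sketch: sum the series X = C + [C] + [[C]] + ... for X - [X] = C,
# where [X] = RX - XR, and compare |X| with the bound |C| / (1 - 2n|R|).
# Assumption: |A| is the largest absolute value of an element of A.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def norm(a):
    return max(abs(x) for row in a for x in row)

def bracket(r, x):
    # [X] = RX - XR for the fixed matrix R
    return sub(matmul(r, x), matmul(x, r))

def solve_by_series(r, c, terms=60):
    # partial sum C_0 + C_1 + ... + C_terms with C_0 = C, C_m = [C_{m-1}]
    total = [row[:] for row in c]
    term = c
    for _ in range(terms):
        term = bracket(r, term)
        total = add(total, term)
    return total

R = [[0.05, 0.10], [-0.08, 0.02]]   # hypothetical data, 2n|R| = 0.4
C = [[1.0, 2.0], [3.0, -1.0]]
n = len(R)
X = solve_by_series(R, C)
residual = norm(sub(sub(X, bracket(R, X)), C))   # |X - [X] - C|
bound = norm(C) / (1 - 2 * n * norm(R))          # right-hand side of (50)
print(residual, norm(X), bound)
```

The residual of the partial sum is exactly the first omitted term of the series, so it decays like $(2n|R|)^m$; this is the same geometric majorant that the proof uses.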
The proof of (i) now becomes of such a trivial nature that the complicated
analytical result of the local Fuchsian theory appears as a mere restatement
of the formal-algebraical fact expressed by the primitive lemma, just
proved, and of the other "linear" properties of the bracket (2).


9. The assumption of (i) is that, in $W' = F(z)W$,

(51) $F(z) = z^{-1}R + \sum_{m=0}^{\infty} A_m z^m$,

where $R$ and $A_m$ are constant matrices and the power series (51) converges
near $z = 0$. The assertion is that there exists a solution matrix which can be
factorized into contributions, $P(z)$ and $z^R$, of the secondary part, (51), and
of the principal part, $z^{-1}R$, of the coefficient matrix $F(z)$, in such a way that
$P(z)$ in

(52) $W(z) = P(z)z^R$

becomes, as (51), a regular power series,

(53) $P(z) = \sum_{m=0}^{\infty} B_m z^m$,

which converges near $z = 0$ and has a determinant which does not vanish
identically.
The latter condition requires, of course, the existence of a (unique)
non-negative integer $l = l(R)$ such that the power series (53) becomes

(53 bis) $P(z) = z^l (B_l + B_{l+1} z + \cdots)$, where $B_l \ne 0$.

After this $l$, the index of the first non-vanishing $B_m$, has been determined,
all the remaining coefficients $B_m$ of (53) will follow uniquely. Finally, it will
be shown that the resulting power series (53) is convergent near $z = 0$.
In terms of the notation defined before (1), this means that

(54) $|B_m| < b^m$

holds for every $m$ and for some $b > 0$. Correspondingly, the assumption made
as to $F(z)$ is that, in (51),

(55) $|A_m| < a^m$,

where $a > 0$ is independent of $m$.
First, if (52) and (51) are substituted into $W' = F(z)W$, there results
for $P(z)$ the differential equation

  $P'(z) + z^{-1}P(z)R = (z^{-1}R + \sum_{m=0}^{\infty} A_m z^m) P(z)$.

If a power series (53) is tried for $P(z)$, this differential equation can be
written in the form

  $\sum_{m=1}^{\infty} m B_m z^{m-1} = z^{-1} \sum_{m=0}^{\infty} [B_m] z^m + \sum_{m=0}^{\infty} A_m z^m \sum_{k=0}^{\infty} B_k z^k$,

where $[B_m] = RB_m - B_m R$. Hence, the comparison of the coefficients of $z^{-1}$
and $z^m$, where $m = 0, 1, \dots$, gives

(56) $[B_0] = 0$

and

(57) $(m + 1)B_{m+1} = [B_{m+1}] + C_m$

respectively, where $C_m$ is an abbreviation for the matrix which is supplied by
Cauchy's multiplication of the last two power series, that is,

(58) $C_m = \sum_{k=0}^{m} A_{m-k} B_k$.

In order to see that this infinite sequence of conditions for the unknowns
$B_0, B_1, \dots$ can be satisfied, let the arbitrary n-rowed matrix, $X$, which occurs
in the definition, (2), of $[X]$ be thought of as a vector with $n^2$ components.
Then $X \to [X]$ is a linear transformation representable by an $n^2$-rowed matrix
(which is determined by $R$, a fixed n-rowed matrix). Consider the characteristic
numbers of this $n^2$-rowed matrix. If one of them happens to be the
reciprocal value of a positive integer (that is, if there exists a non-negative
integer, $m$, for which

  $X - s[X] = 0$, where $s = (m + 1)^{-1}$,

has a solution $X \ne 0$), let $l$ denote the greatest of these positive integers
($= m + 1$). Otherwise put $l = 0$. Thus $l = l(R)$ is a unique non-negative
integer which, when it is positive, is characterised by the following pair of
properties:

(59) $lX - [X] = 0$

has a solution $X \ne 0$, and, if $m + 1 > l$,

(60) $(m + 1)X - [X] = C$

has a solution $X$ for every $C$. And this pair of properties is characteristic of
$l = l(R)$ in the case $l = 0$ also. In fact, (59) then becomes $-[X] = 0$ and
has therefore the solution $X = E \ne 0$, the bracket (2) being the zero matrix
when $X$ is the unit matrix. [It will be observed that, if $l > 0$, this definition
of $l = l(R)$ fails to apply in the case of infinite bounded matrices, first of all
because the non-existence of a unique bounded reciprocal matrix fails to remain
equivalent to the existence of a non-trivial solution of the homogeneous
equations.] The coefficients $B_m$ of (53) can now be determined as follows:

Whether $l = 0$ or $l > 0$, let $B_l$ be a solution $X \ne 0$ of (59). In the first
case, this implies that (56) is satisfied. In the second case, that is, if there
exists at least one $B_m$ preceding $B_l$, let $B_0 = 0, \dots, B_{l-1} = 0$. Then (56)
remains fulfilled, since the matrix (2) is 0 when $X = 0$, and (58) becomes
satisfied for every non-negative $m < l$, the corresponding matrices $C_m$ being
$C_0 = 0, \dots, C_{l-1} = 0$. Consequently, (57) is satisfied for every non-negative
$m < l - 1$ (provided that there exists such an $m$, i.e., that $l > 1$). But (57)
is satisfied for $m = l - 1$ also (whether $l > 1$ or $l = 1$), since, in view of
$C_{l-1} = 0$, the condition assigned for $B_l$ by (57) is identical with the
assumption that $B_l$ is a solution of (59).

Accordingly, whether $l = 0$ or $l > 0$, the $l + 1$ matrices $B_0, \dots, B_l$ are
defined so as to satisfy the initial condition (56) and those of the equations
(57) and (58) which belong to any non-negative $m < l$ (provided that there
exists such an $m$, i.e., that $l > 0$). Now let $m = l$ (whether $l > 0$ or $l = 0$).
Then (58) defines $C_l$ as a function of $B_0, \dots, B_l$. Hence, the case $m = l$ of
(57) becomes a condition for $B_{l+1}$. But this condition has a unique solution,
since (60) has a unique solution $X = X^{C}$ whenever $m + 1 > l$. And it is
now clear that the possibility of a complete induction, leading from a given
$B_m$ to the corresponding $B_{m+1}$, is never arrested.

This proves the existence of a (formal) power series (53 bis). What 
remains to be shown is that this power series has a non-vanishing radius of 
convergence. 

To this end let (57) be multiplied by $(m + 1)^{-1}$, and let $m + 1$ be
replaced by $m$. Then it is seen from (2) that the resulting representation of
(57) can be written in the form (49), by choosing

  $X$, $C$, $R$ to be $B_m$, $m^{-1}C_{m-1}$, $m^{-1}R$

respectively. It follows therefore from the Liouvillian lemma, that $B_m$ (exists,
is unique and) is subject to the inequality

  $|B_m| \le |C_{m-1}| / (m - 2n|R|)$,

if the proviso of (50) is satisfied, that is, if the denominator is positive.
In the present case, this proviso is satisfied whenever the first term, $m$, of the
denominator is large enough, since the second term, $-2n|R|$, depends only
on the dimension number and the residue, which are fixed. Consequently,
there exists a positive constant, say $d$, satisfying

  $|B_m| \le d\,|C_{m-1}| / m$

from a certain $m = m_0$ onward.

In particular, if $d$ is chosen large enough, $|B_m| \le d\,|C_{m-1}|$ holds from
$m = 0$ onward (provided that $C_{-1}$ is declared to denote 1, for instance).
Hence, it is seen from (58) and (1) that

  $|C_m| \le nd \sum_{k=0}^{m} |A_{m-k}|\,|C_{k-1}|$.

Consequently, if $c_m$ denotes the greatest of the values $|C_{k-1}|$, where $k \le m$,
then the last two formula lines imply that $|B_m| = O(c_m)$ and

  $c_{m+1} \le nd\,c_m \sum_{k=0}^{m} a^{m-k}$

by (55). Hence,

  $c_{m+1} \le nd\,c_m\,O(\alpha^m)$, where $\alpha = \max(2, a)$,

and so, by induction,

  $c_m = O(\beta^m)$, where $\beta = nd\alpha$.

It follows therefore from $|B_m| = O(c_m)$ that (54) is satisfied by some $b > 0$,
which proves that (53) has a circle of convergence.


10. For a definitive nature of the formal algebra in the above treatment 
of a Fuchsian point, a test case quite different from the resulting possibility 
of extending the Fuchsian theory to generalized linear spaces will now be 
considered. In fact, the Fuchsian character of the singular point, that is, the 
assumption of a vanishing rank (Poincaré), will now be transferred to the 
non-degenerate case of a positive rank, leading to normal series (Thomé). 

In this case of (in general) divergent expansions, the standard formal 
complications arising from a multiple invariant factor, and (even if all these 
factors are simple) from a characteristic exponent which is multiple (mod 1), 
are not usually treated in detail. But it turns out that, in the proof for the 
existence of the full number of formal expansions, the classical method of 
counting the constants, that is, the machinery of the theory of elementary 
divisors, achieves, again, more harm than good. In fact, a more direct procedure 
is able to lead to explicit results corresponding to the factor $z^R = \exp(R \log z)$
in (52). 




The reasons for the above wording, (i), of the theorem dealing with the
case of rank 0, and for the possibility of a straightforward verification of this
wording, were two-fold: The first factor in (52), that is, the perturbation
due to (51), was automatically prescribed by the "second method of the
variations of constants," and the explicit factor, $z^R$, in (52) was a fundamental
matrix of the undisturbed system. In view of (51), the latter belongs
to the coefficient matrix $F(z) = z^{-1}R$, that is, to Cauchy's elementary system,
$zW' = RW$, where $R = \text{Const.}$ Precisely this system and this reasoning are
the formal foundations of the non-local theory of Schlesinger, quoted above.

Correspondingly, the prototype of a singularity of arbitrary rank, say $\mu$,
is the singularity of the system $z^{\mu+1}W' = RW$ (at $z = 0$), where $R = \text{Const.}$
But a differentiation verifies that a solution matrix, $W = W(z)$, of this
prototype is simply

(61) $R_\mu(z) = \exp(-\mu^{-1} z^{-\mu} R)$

(if $\mu \ne 0$; if $\mu = 0$, then $-\mu^{-1}z^{-\mu}$ must be interpreted as its limit when
$\mu \to 0$, that is, as $\log z$; so that $R_0(z)$ becomes $\exp(R \log z) = z^R$). Consequently,
if the trivial coefficient matrix of the prototype is disturbed somewhat,
the "second method of the variation of constants" assigns a fundamental
matrix of the form $P(z)R_\mu(z)$, where the matrix $P$ (which must be written
in front of $R_\mu$) represents the perturbation.
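
The differentiation that verifies (61) can be imitated numerically. The block below is a sketch only: it restricts $z$ to a positive real value, sums the exponential series directly, and approximates $W'$ by a central difference; the matrix `R`, the rank `mu`, the point `z` and all helper names are hypothetical choices made for the illustration.

```python
# Sketch: check numerically that W(z) = exp(-(1/mu) z^(-mu) R) satisfies
# z^(mu+1) W'(z) = R W(z) at one real point z > 0 (finite-difference check).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    # exponential series E + a + a^2/2! + ... (adequate for small matrices)
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

def R_mu(z, r, mu):
    # formula (61) for mu != 0
    return expm(scale(-(z ** (-mu)) / mu, r))

R = [[0.3, 1.0], [0.0, 0.3]]   # hypothetical residue, not diagonalizable
mu, z, h = 2, 1.5, 1e-5
derivative = scale(1.0 / (2 * h), sub(R_mu(z + h, R, mu), R_mu(z - h, R, mu)))
lhs = scale(z ** (mu + 1), derivative)       # z^(mu+1) W'
rhs = matmul(R, R_mu(z, R, mu))              # R W
residual = max(abs(x) for row in sub(lhs, rhs) for x in row)
print(residual)
```

A non-diagonalizable `R` is used on purpose: the check does not involve characteristic numbers or elementary divisors, in keeping with the point made in the text.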

It turns out that this plan, leading to an explicit representation, (61),
of the principal terms in a complete set of linearly independent "normal"
(or "anormal") series, can be carried out without any difficulty. The reason
is precisely the avoidance of characteristic numbers and elementary divisors,
which only disguise the simple exponential factor, (61). In other words, the
leading terms of the expansions, the terms responsible for the formal difficulties
of the usual treatment, can be split off en bloc, since they happen to be
identical with the exact solution, (61), of the elementary prototype. This is
the content of the following formal extension of (i), where $\mu = 0$, to the case
of an arbitrary rank $\mu$:


(i*) If the n-rowed matrix $F$ is an analytic function having a pole at
the point $z = 0$ of the z-plane, then $W' = F(z)W$ has a formal solution matrix
of the form $P(z)R_\mu(z)$, where the matrix $P(z)$ is a power series (53), with
a (unique) non-negative $l = l(R)$ in (53 bis), and the second factor is the
exponential matrix (61) in which $\mu + 1$ denotes the order of the pole, and
$R = \text{Const.}$ the coefficient of the leading term, of $F(z)$ at $z = 0$ (that is,
$z^{\mu}F(z)$ has a simple pole, of residue $R$, at $z = 0$, the subscript of (61) being
identical with Poincaré's non-negative index of rank).


In other words, the assumption is that

  $F(z) = z^{-\mu-1}R + H(z)$,

where $H(z)$ has at $z = 0$ a pole of order $\mu$ ($\ge 0$), at most. And the assertion
is that $W' = F(z)W$ can formally be satisfied by

  $W(z) = P(z)R_\mu(z)$,

where $P(z)$ is some power series of the form (53), (53 bis).
First, the substitutions defined by the last two formula lines transform
$W' = FW$ into

  $P'R_\mu + PR_\mu' = z^{-\mu-1}RPR_\mu + HPR_\mu$.

Since, by (61),

  $R_\mu' = z^{-\mu-1}RR_\mu$,

it follows that the differential equation to be satisfied by $P = P(z)$ can be
written in the form

  $P' = z^{-\mu-1}[P] + HP$,

if use is made of the abbreviation (2).

If $\mu = 0$, the assertion of (i*) is contained in that of (i). Since, as will
be clear from the proof, there is no difference between the treatment of $\mu = 1$
and that of an arbitrary $\mu > 0$, let the formulae be curtailed by choosing
$\mu = 1$. Then, the pole of $H(z)$ being of order $\mu$ (at most),

  $P' = z^{-2}[P] + z^{-1}SP + G(z)P$,
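where $S$ denotes the residue of $H(z)$ at $z = 0$ and $G(z)$ is a regular power
series,

  $G(z) = \sum_{m=0}^{\infty} A_m z^m$.

Thus, from (53) and (2),

  $\sum_{m=1}^{\infty} m B_m z^{m-1} = z^{-2} \sum_{m=0}^{\infty} [B_m] z^m + z^{-1} \sum_{m=0}^{\infty} SB_m z^m + \sum_{m=0}^{\infty} (\sum_{k=0}^{m} A_{m-k}B_k) z^m$.

If this is written in the form

  $\sum_{m=1}^{\infty} m B_m z^m = \sum_{m=0}^{\infty} [B_m] z^m + \sum_{m=0}^{\infty} SB_m z^{m+1} + \sum_{m=0}^{\infty} (\sum_{k=0}^{m} A_{m-k}B_k) z^{m+2}$,

then, since $S = \text{Const.}$, the comparison of the coefficients of $z^0$, $z^1$ and
$z^m$ is seen to lead to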


  $0 = [B_0]$, $B_1 = [B_1] + SB_0$

and

  $mB_m = [B_m] + SB_{m-1} + \sum_{k=0}^{m-2} A_{m-2-k}B_k$

respectively, where $m = 2, 3, \dots$. But the first of these three conditions is
precisely (56), whereas the second and the third can be united, and appear in
the form (57), where $m + 1$ commences with $m + 1 = 1$ and, in view of the
last two formula lines, $C_m$ must be defined as follows:

(62) $C_0 = SB_0$; $C_m = SB_m + \sum_{k=0}^{m-1} A_{m-1-k}B_k$ if $m > 0$.
k=0 


Accordingly, the full system of conditions is just the same as above, since
it consists of (56) and (57), where $m = 0, 1, \dots$. However, (58) must
now be replaced by (62). But this shift in the structure of $C_m$ prevents a
repetition of the proof of convergence by means of a dominating geometrical
progression, as given above for the case, $\mu = 0$, of (i). And, as is well known,
the radius of convergence of the resulting power series (53) is 0 even in the
simplest examples of rank $\mu = 1 \ne 0$.

Correspondingly, (i*) does not contain any statement concerning the
existence of actual solutions (functions) to which the formal power series
"belong." However, it could be proved that the formal power series, (53 bis),
is summable in Borel's sense (if $\mu = 1$, and in the sense of Le Roy's
generalized Laplace-Borel transforms, if $\mu$ is arbitrary).


11. In his fundamental paper referred to above (and, at least between
the lines, in his earlier investigations which he mentions but does not presuppose
there), Schlesinger is led to the quadratic system of $k + 1$ matrix
differential equations

(63) $X_0' = \sum_{m=1}^{k} (t - a_m)^{-1}[X_0, X_m]$, $X_m' = (t - a_m)^{-1}[X_m, X_0]$, $m = 1, \dots, k$ ($[A, B] = AB - BA$),

where $a_1, \dots, a_k$ are (distinct) points of the complex t-plane and $X_0, \dots, X_k$
are n-rowed matrices. Hence, in terms of scalars, $x_i$,

(64) $x_i' = L_i(x_1, \dots, x_p; t) + Q_i(x_1, \dots, x_p; t)$,

where the order, $p$, of the system is $(k + 1)n^2$, and $L_i$ and $Q_i$ denote linear
and quadratic forms respectively (actually, $L_i$ vanishes identically). However,
it is clear that (63) has the linear integral




(63 bis) $\sum_{j=0}^{k} X_j(t) = \text{const.}$,

by means of which it is reducible to a system of the form (64) and of order
$p = kn^2$.

By connecting (63) with his schlechthin canonical linear systems of the
Fuchsian type, Schlesinger recognizes in the quadratic system (63) a class of
non-linear differential equations (substantially of order $kn^2$) which, precisely
because of its connection with the Riemann-Fuchs problem, has only solutions
with fixed "critical" singularities. In other words, all movable singularities
of any of the functions $x_i(t)$ (that is, those of their singular points, $t = t^0$,
which depend on integration constants) are poles. And the fact is that, no
matter how transparent the function-theoretical situation may be, the formulation
(64) of (63) exhausts, in case of the lowest values of $p$, all types contained
in the work of Painlevé and his pupils, and leads, with increasing $p$,
to an infinity of new "Painlevé transcendents."

It seems to be worth observing (if it has not been observed before), that
there is a class of systems of the same type as (63) but depending on purely
formal considerations, rather than on function-theoretical arguments. The
systems in question are again of the form (64), result again from linear
matrix differential equations, and all the critical singularities are again fixed,
but this time for an explicit reason: The solutions $X(t)$ depend on the
reciprocal matrices $U^{-1} = U^{-1}(t)$ of solutions $U = U(t)$ of a linear differential
equation of the second order for the matrix $U$ (cf. (70) and (71) below),
and so the only movable singular points, $t = t^0$, of the elements, $x_i(t)$, of the
matrix $X(t)$ are those at which the determinant of a particular $U(t)$ happens
to vanish; a situation corresponding to that dealt with in the theory of conjugate
points.

12. The nature of the class in question being of a formal origin, it is
unnecessary to assume analyticity. In fact, all that is needed is a transcription
of Riccati's quadratic equation,

(65) $x' = a(t) + b(t)x + c(t)x^2$,

to the case of matrices. The sole trick is that the quadratic term, $c(t)x^2$,
in (65) must then be written as $xc(t)x$:

(66) $X' = A(t) + B(t)X + XC(t)X$.

If the matrices are n-rowed, (66) is a system (64) of order $p = n^2$.

In order to simplify the situation, suppose first that the third coefficient
of (66) is the unit matrix:

(67) $X' = A(t) + B(t)X + X^2$.




As to $A(t)$ and $B(t)$, it is sufficient to assume mere continuity on a t-interval.
Every pair of initial vectors, say $u(t_0)$ and $u'(t_0)$, determines a solution
vector $u = u(t)$ of

(68) $u'' - B(t)u' + A(t)u = 0$.

If $u_1(t), \dots, u_n(t)$ are $n$ solution vectors of (68), the matrix

(69) $U(t) = (u_1, \dots, u_n)$

is a solution of

(70) $U'' - B(t)U' + A(t)U = 0$,

and conversely. However, even if the columns of (69) are linearly independent,
the n-rowed matrix (69) cannot be a fundamental matrix of (68),
since (68), being of order $2n$, has $2n$, instead of just $n$, linearly independent
solutions $u = u(t)$. In particular, $\det U(t)$ can have an isolated zero, say
$t = t^0$, within the t-interval of continuity (or, for that matter, regular-analyticity)
of the coefficient functions $A(t)$, $B(t)$.

It is precisely the possibility of such zeros $t = t^0$ that leads to "movable
singularities" of $X(t)$. All the other "singular" points of $X(t)$ are "fixed,"
namely, such as to be "singularities" of the coefficient functions $A(t)$, $B(t)$
themselves. For, on the one hand, (70) is linear and, on the other hand, the
connection between (70) and (67) is simply as follows:

(71) $X = -U'U^{-1}$.

Here $U = U(t)$ denotes any set (69) of linearly independent solutions
of (68). Hence, $\det U(t)$ does not vanish identically. Consider a t-interval
on which $\det U(t)$ has no zero. On such an interval, (71) defines a function
$X = X(t)$. The latter has a continuous first derivative, since $U = U(t)$,
being a solution of (70), has a continuous second derivative. But, since
$(U^{-1})' = -U^{-1}U'U^{-1}$ (cf. p. 199 above), the derivative of (71) is

  $X' = -U''U^{-1} + U'U^{-1}U'U^{-1}$.

Hence, direct substitutions show that (67) becomes an identity in $t$ by virtue
of (71) and (70).
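
The mechanism behind the movable singularities is already visible in the scalar case $n = 1$. As an illustration only (the coefficients $A = 1$, $B = 0$ and the initial normalization $u(0) = 1$ are hypothetical choices, not taken from the text), (67) becomes $x' = 1 + x^2$ and (68) becomes $u'' + u = 0$; the zero of $u$, and hence the pole of $x = -u'/u$, moves with the initial value $x(0) = x_0$.

```python
# Sketch: scalar case of (67), (68), (71) with A = 1, B = 0 (assumed data).
# x' = 1 + x^2 is solved by x = -u'/u where u'' + u = 0.
import math

def u(t, x0):
    # solution of u'' + u = 0 with u(0) = 1, u'(0) = -x0
    return math.cos(t) - x0 * math.sin(t)

def du(t, x0):
    return -math.sin(t) - x0 * math.cos(t)

def x(t, x0):
    # substitution (71) for n = 1; x(0) = x0
    return -du(t, x0) / u(t, x0)

def pole(x0):
    # first positive zero of u, i.e. the movable pole of x
    return math.pi / 2 - math.atan(x0)

def riccati_residual(t, x0, h=1e-6):
    # central-difference approximation of x' - (1 + x^2)
    derivative = (x(t + h, x0) - x(t - h, x0)) / (2 * h)
    return derivative - (1 + x(t, x0) ** 2)

print(pole(0.0), pole(2.0), riccati_residual(0.3, 0.5))
```

The linear equation $u'' + u = 0$ has no singular point at all, yet the pole of $x$ sits at $\pi/2 - \arctan x_0$, a point that depends on the integration constant.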

In the more general case, (66), the assumption of mere continuity remains
sufficient for $A(t)$ and $B(t)$, but $C(t)$ must be assumed to have a continuous
first derivative and a non-vanishing determinant. In fact, (71) must now be
replaced by

(72) $X = -C^{-1}U'U^{-1}$,

and (70) by




(73) $C^{-1}(t)U'' - {}^*B(t)U' + A(t)U = 0$,

where ${}^*B$ is an abbreviation for

(74) ${}^*B = BC^{-1} + C^{-1}C'C^{-1}$.

The verification is the same as before, the derivative of (72) being

  $X' = DU'U^{-1} - C^{-1}U''U^{-1} + C^{-1}U'U^{-1}U'U^{-1}$,

where $D$ denotes the second term, $C^{-1}C'C^{-1}$, of (74).
It will be observed that, in view of (69), the system (73) is equivalent to

  $C^{-1}(t)u'' - {}^*B(t)u' + A(t)u = 0$,

a linear system of order $2n$, whereas (66) is a quadratic system of order $n^2$.
Thus, the reduction effected by Riccati's substitution, (72), in the classical
case, $n = 1$, is quite accidental, since $2n$ becomes equal to $n^2$ when $n = 2$,
and is less than $n^2$ from $n = 3$ onward. It follows that, in contrast to the
classical case, the connection between (66) and (73) does not play the role
of "reducing the order," a role somewhat insignificant since about Riemann's
time, but rather the role of a function-theoretical reduction, leading, via a
linear system, to a non-linear system with fixed critical singularities.
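
Since the verification is purely algebraic, it can be checked at a single value of $t$ with arbitrary numerical matrices. The following sketch assumes $n = 2$; the matrices `A`, `B`, `C`, `Cp` (standing for the value of $C'$), `U` and `V` (standing for the value of $U'$) are hypothetical test data, and $U''$ is computed from the linear equation $C^{-1}U'' - {}^*BU' + AU = 0$ with ${}^*B = BC^{-1} + C^{-1}C'C^{-1}$.

```python
# Sketch: pointwise check that X = -C^{-1} U' U^{-1} turns the linear
# second-order equation into X' = A + B X + X C X (2-rowed test matrices).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neg(a):
    return [[-x for x in row] for row in a]

def inv2(a):
    # inverse of a 2-rowed matrix with non-vanishing determinant
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

A = [[1.0, 0.5], [0.0, -1.0]]
B = [[0.0, 1.0], [-2.0, 0.5]]
C = [[2.0, 1.0], [0.0, 1.0]]
Cp = [[0.5, -0.2], [0.1, 0.3]]     # assumed value of C'(t)
U = [[2.0, 1.0], [0.5, 1.5]]       # assumed value of U(t), det U != 0
V = [[0.3, -0.7], [1.1, 0.2]]      # assumed value of U'(t)

Cinv, Uinv = inv2(C), inv2(U)
D = matmul(matmul(Cinv, Cp), Cinv)            # C^{-1} C' C^{-1}
starB = add(matmul(B, Cinv), D)               # *B
U2 = matmul(C, sub(matmul(starB, V), matmul(A, U)))   # U'' from the linear equation
W = matmul(V, Uinv)                           # U' U^{-1}
X = neg(matmul(Cinv, W))                      # Riccati substitution
Xprime = add(sub(matmul(D, W), matmul(matmul(Cinv, U2), Uinv)),
             matmul(matmul(Cinv, W), W))      # derivative of the substitution
rhs = add(add(A, matmul(B, X)), matmul(matmul(X, C), X))
residual = max(abs(x) for row in sub(Xprime, rhs) for x in row)
print(residual)
```

Choosing `C` as the unit matrix and `Cp` as the zero matrix reduces the same check to the simpler case (67), (70), (71).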

The exceptional standing of the scalar case in question concerning a
reduction of

(75) $y'' + P(t)y' + Q(t)y = 0$

can also be seen from what results if a "Tschirnhaus transformation" is tried
for (75), where both n-rowed matrices, $P(t)$ and $Q(t)$, are given as continuous
functions on a t-interval. In fact, if $n = 1$, just the quadrature of
$P(t)$ is needed in order to transform (75) into an equation, say

(76) $z'' + R(t)z = 0$,

in which the first derivative does not occur. But if $n > 1$, what is needed to
this end is the general solution of

(77) $u' = -\tfrac{1}{2}P(t)u$,

that is, of a system of order $n$ (so that (76) and (77) together represent a
system of order $3n$, whereas (75) itself is of order $2n$). That the general
solution of (77), that is, a fundamental matrix (69), actually leads to a
reduction of (75) to (76), can be verified by what amounts to an application
of the variation of constants which, however, fails under the mere assumption
of continuity for the matrices $P(t)$, $Q(t)$.

Suppose therefore that $P(t)$ has a continuous first derivative. Then every
fundamental matrix, $U(t)$, of (77) has a continuous second derivative. In
order to "vary the constants," put

(78) $Y = UZ$,

where $Y = Y(t)$ denotes a matrix formed by $n$ linearly independent solution
vectors of (75). Then $Z = Z(t)$ supplies a corresponding solution of (76),
where the coefficient matrix is the continuous function $R = R(t)$ defined by

(79) $R = -\tfrac{1}{2}U^{-1}(P' + \tfrac{1}{2}P^2 - 2Q)U$

($U^{-1}$ exists, since $U = U(t)$ is a fundamental matrix).
In fact, from (78),

  $Y' = U'Z + UZ'$, $Y'' = U''Z + 2U'Z' + UZ''$.

If this pair of relations and (78) are substituted into what results when $y$ is
replaced by $Y$ in (75), then, since the assumption (77) is equivalent to
$2U' + PU = 0$, it follows that

  $UZ'' + (U'' + PU' + QU)Z = 0$.

Since

  $U' = -\tfrac{1}{2}PU$, hence $U'' = -\tfrac{1}{2}(P' - \tfrac{1}{2}P^2)U$,

this can be written in the form $Z'' + RZ = 0$ or (76), if $R$ is defined by (79).
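
For $n = 1$ the factors $U^{-1}$ and $U$ cancel and the reduction can be tested on a closed-form example. The sketch below assumes the hypothetical coefficients $P(t) = 2t$ and $Q(t) = t^2 + 2$; with the sign convention $R = -\tfrac{1}{2}(P' + \tfrac{1}{2}P^2 - 2Q)$ obtained from the substitution $y = uz$, $2u' + Pu = 0$, this gives $R = 1$, so $z = \cos t$ and $u = e^{-t^2/2}$.

```python
# Sketch: scalar Tschirnhaus reduction y = u z for y'' + P y' + Q y = 0
# with the assumed coefficients P(t) = 2t, Q(t) = t^2 + 2.
import math

def P(t):
    return 2.0 * t

def dP(t):
    return 2.0

def Q(t):
    return t * t + 2.0

def R(t):
    # scalar form of the reduced coefficient: R = -(P' + P^2/2 - 2Q)/2
    return -0.5 * (dP(t) + 0.5 * P(t) ** 2 - 2.0 * Q(t))

def u(t):
    # solves u' = -P u / 2 with u(0) = 1
    return math.exp(-t * t / 2.0)

def z(t):
    # solves z'' + R z = 0, here z'' + z = 0
    return math.cos(t)

def y(t):
    return u(t) * z(t)

def residual(t, h=1e-4):
    # finite-difference value of y'' + P y' + Q y
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / (h * h)
    return d2 + P(t) * d1 + Q(t) * y(t)

print(R(0.7), residual(0.7))
```

In this example the quadrature of $P$ is all that is needed to find $u$, which is the feature of the scalar case that the text contrasts with $n > 1$.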


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


The report of E. Hilb, "Lineare Differentialgleichungen im komplexen Gebiet,"
Encyklopädie der mathematischen Wissenschaften, vol. II, pp. 471-562, more particularly
pp. 473-494, gives a full account (up to 1915) of the extended literature of the
local problems referred to above. As to the classical beginnings of the theory, cf. also
the comments of O. Haupt at the end of his edition of F. Klein's Vorlesungen über die
hypergeometrische Funktion, Berlin, 1933.




FIXED POINT THEOREMS FOR MULTI-VALUED 
TRANSFORMATIONS.* 


By SAMUEL EILENBERG and DEANE MonTGOMERY. 


1. Introduction. Recently there have been several extensions of known
fixed point theorems in which the transformation $T$ takes each point of a compact
metric space $M$ into a closed subset of $M$. For such a transformation a
point $x$ is said to be a fixed point if $x$ is in $T(x)$. These extensions first
occurred in von Neumann's work on the theory of games (see [8] where
earlier references are also given). Kakutani [3] proved a theorem which we
shall formulate below, and Wallace [9] also has a theorem in this direction.

Our purpose in the present paper is to present a general fixed point 
theorem which on the one hand includes a very general form of the famous 
fixed point formula of Lefschetz [4], [5], [6] and on the other hand implies 
the fixed point theorems of Kakutani and Wallace. 

We rely heavily on a theorem proved by Vietoris [7] and with this as a 
tool the proof for the case at hand resembles the proof given by Lefschetz. 


2. Definitions and theorems. Let $M$ and $N$ be metric compact spaces
and let $T: M \to N$ be a multi-valued function, i.e., a correspondence which to
each $x \in M$ assigns one or more points of $N$. For every $x \in M$, $T(x)$ will denote
the set of all "images" of $x$. The function $T$ is continuous provided $x_n \to x$,
$y_n \to y$ and $y_n \in T(x_n)$ imply $y \in T(x)$. The graph of $T$ is the subset of the
cartesian product $M \times N$ consisting of points $(x, y)$ with $y \in T(x)$. The continuity
of $T$ is equivalent with the condition that the graph of $T$ is closed. The continuity of
$T$ implies that the sets $T(x)$ are closed. If we regard $T$ as a point-to-set function
then the continuity of the multi-valued function $T$ is equivalent with the
upper semi-continuity of the set-valued function.

Unless otherwise stated, we shall use Vietoris cycles and homologies over 
a field F of coefficients. 


Definition. A compact metric space $X$ is said to be acyclic provided
1°) $X$ is non-vacuous, 2°) the homology groups $H_q(X)$ vanish for $q > 0$,
3°) the reduced 0-th homology group $\tilde{H}_0(X)$ vanishes.


* Received January 4, 1946. 




FIXED POINT THEOREMS FOR MULTI-VALUED TRANSFORMATIONS, 215 


The reduced 0-th homology group is obtained by considering only cycles 
in which the sum of coefficients is zero. 


THEOREM 1. Let $M$ be an acyclic absolute neighborhood retract and
$T: M \to M$ a continuous multi-valued function such that for every $x \in M$ the
set $T(x)$ is acyclic. Then $T$ has a fixed point.


Since every compact convex set in a Euclidean space is an absolute 
neighborhood retract and is acyclic, Theorem 1 yields: 


KAKUTANI'S THEOREM. Let $M$ be a compact convex subset of Euclidean
n-space and let $T: M \to M$ be a continuous multi-valued function such that
for every $x \in M$ the set $T(x)$ is convex. Then $T$ has a fixed point.
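
A one-dimensional illustration may help fix the definitions. The map below is a hypothetical example, not taken from the paper: on $M = [0, 1]$ it sends $x < 1/2$ to $\{1\}$, $x > 1/2$ to $\{0\}$, and $x = 1/2$ to the whole interval $[0, 1]$. Its graph is closed and every $T(x)$ is convex, so the theorem applies; the sketch searches a grid for points with $x \in T(x)$.

```python
# Sketch: a multi-valued map on [0, 1] with closed graph and convex values
# (hypothetical example). Each value T(x) is an interval, stored as (lo, hi).

def T(x):
    if x < 0.5:
        return (1.0, 1.0)      # the single point {1}
    if x > 0.5:
        return (0.0, 0.0)      # the single point {0}
    return (0.0, 1.0)          # the whole interval [0, 1] at x = 1/2

def is_fixed(x):
    # x is a fixed point if x belongs to T(x)
    lo, hi = T(x)
    return lo <= x <= hi

grid = [k / 100 for k in range(101)]
fixed_points = [x for x in grid if is_fixed(x)]
print(fixed_points)
```

The only fixed point is $x = 1/2$, exactly where the value set is enlarged; any single-valued selection that jumps from 1 to 0 at $1/2$ without taking the value $1/2$ there has no fixed point, which is why the multi-valued formulation is needed.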


Wallace’s theorem (in the metric case) follows from Theorem 1 by assum- 
ing that M is a tree. We also note that the case when M is an acyclic (finite) 
polyhedron is included in Theorem 1 since polyhedra are absolute neighborhood 
retracts. 


3. Reformulation of the problem. We first formulate Theorem 1 in 
a somewhat more general fashion. 


THEOREM 2. Let $M$ be an acyclic absolute neighborhood retract, $N$ a compact
metric space, $r: N \to M$ a continuous single-valued mapping and $T: M \to N$
a multi-valued continuous mapping such that all the sets $T(x)$ are acyclic for
$x \in M$. Then the combined (multi-valued) mapping $rT: M \to M$ has a fixed
point.


Taking $N = M$ and $r(x) = x$ for all $x \in M$ yields Theorem 1.


Definition. A continuous mapping $f: N \to M$, where $N$ and $M$ are compact
metric spaces will be said to have property (V) provided for every $x \in M$
the antecedent set $f^{-1}(x)$ is acyclic.

It follows that if $f$ satisfies property (V) then it maps $N$ onto $M$.

Using the property (V) we can replace Theorem 2 by an equivalent
theorem involving only single-valued transformations.


THEOREM 3. Let $M$ be an acyclic absolute neighborhood retract, $N$ a compact
metric space and $r: N \to M$, $t: N \to M$ continuous mappings. If $t$ satisfies
property (V) then $r$ and $t$ have a coincidence, i.e., $r(x) = t(x)$ for some
$x \in N$.


We first show that Theorem 3 implies Theorem 2. Let M, N, r and T 


216 SAMUEL EILENBERG AND DEANE MONTGOMERY. 


satisfy the conditions of Theorem 2. Consider in the cartesian product $M \times N$
the subset $N^*$ consisting of all points of the form $(x, y)$ where $x \in M$, $y \in T(x)$.
Define the mappings $r^*: N^* \to M$ and $t^*: N^* \to M$ by setting

  $r^*(x, y) = r(y)$, $t^*(x, y) = x$.

Since $t^{*-1}(x)$ is homeomorphic with $T(x)$, the mapping $t^*$ satisfies property
(V). Hence by Theorem 3 $r^*$ and $t^*$ have a coincidence; this means that
there is an $x \in M$ and a $y \in T(x)$ such that $r(y) = x$. Consequently $x \in rT(x)$.

Conversely suppose that Theorem 2 holds. Let $M$, $N$, $r$ and $t$ satisfy the
conditions of Theorem 3. Define $T(x) = t^{-1}(x)$ for each $x \in M$. It is then
clear that $M$, $N$, $r$ and $T$ satisfy the conditions of Theorem 2 and there is a
point $x \in M$ such that $x \in rT(x)$. Hence the sets $r^{-1}(x)$ and $t^{-1}(x) = T(x)$
intersect. Let $x' \in r^{-1}(x) \cap t^{-1}(x)$; then $r(x') = t(x')$.


4. The Lefschetz number. Given a continuous mapping $f: N \to M$ of
a compact metric space $N$ into another such space $M$ we shall denote by $f_i$ the
homomorphism of the homology groups, $f_i: H_i(N) \to H_i(M)$, induced by $f$.
The following theorem established by Vietoris [7] is of fundamental importance
here.


VIETORIS' THEOREM. If $t: N \to M$ has property (V) then $t_i$ maps $H_i(N)$
isomorphically onto $H_i(M)$.


We are now in a position to formulate the theorem that we propose to 
prove. 


THEOREM 4. Let $M$ be an absolute neighborhood retract, $N$ a compact
metric space, $r: N \to M$, $t: N \to M$ continuous mappings of which $t$ satisfies
property (V). Consider the Lefschetz number:

  $\Lambda(r, t) = \sum_i (-1)^i \operatorname{trace}(r_i t_i^{-1})$.

If $\Lambda(r, t) \ne 0$ then $r$ and $t$ have a coincidence.


We first note that $t_i^{-1}$ is defined in view of Vietoris' Theorem. Further
since $M$ is an absolute neighborhood retract $H_i(M)$ is a finite dimensional
vector space (over the coefficient field) and the trace of the homomorphism

  $r_i t_i^{-1}: H_i(M) \to H_i(M)$

is defined. Finally since $H_i(M) = 0$ for sufficiently large $i$ the summation
$\sum_i (-1)^i \operatorname{trace}(r_i t_i^{-1})$ is actually finite and $\Lambda(r, t)$ is defined.




If $M$ is acyclic then $H_i(M) = 0$ for $i > 0$ and $\Lambda(r, t) = 1$. Hence
Theorem 4 implies Theorem 3.

In the next section we shall prove Theorem 4 assuming that $M$ is a polytope.
From this partial result we derive in Section 6 Theorem 4 in full generality.

Before proceeding we point out that Theorem 4 implies Theorem 5 which
is an extension of the Lefschetz fixed point formula. Let $M$ be an absolute
neighborhood retract and let $T\colon M \to M$ be a continuous multi-valued function
such that $T(x)$ is acyclic. In the product of $M$ and $M$ let $N$ be the graph of $T$.
Define the mappings $r\colon N \to M$ and $t\colon N \to M$ as follows:


$$r(x,y) = y, \qquad t(x,y) = x.$$

Then $t$ satisfies property (V) and we may form the Lefschetz number

$$\Lambda(T) = \Lambda(r,t) = \sum_i (-1)^i \operatorname{trace}(r_i t_i^{-1}).$$

Theorem 4 implies


THEOREM 5. Let $M$ be an absolute neighborhood retract and let $T\colon M \to M$
be a continuous multi-valued function such that $T(x)$ is acyclic for each $x \in M$.
If $\Lambda(T) \neq 0$ then $T$ has a fixed point.


5. Proof of Theorem 4 for polyhedra. Assume that $M$ is a finite
polyhedron given in a simplicial decomposition $\mathfrak{M}$. For each $x \in M$ denote by
$v(x)$ a vertex of the lowest dimensional simplex containing $x$. We may select
$\epsilon > 0$ such that

(1) for every $\epsilon$-chain $c$ in $M$, $v(c)$ is a chain on $\mathfrak{M}$.

Assume now that $r$ and $t$ do not have a coincidence; we may then select
the simplicial decomposition $\mathfrak{M}$ so fine that

(2) for each $x \in N$ the vertices $v(r(x))$ and $v(t(x))$ are not in the same
simplex of $\mathfrak{M}$.

Since $r$ and $t$ are continuous we may select $\epsilon_1 > 0$ so that

(3) for every $\epsilon_1$-chain $c$ in $N$, $r(c)$ and $t(c)$ are $\epsilon$-chains in $M$.

Further it follows from Vietoris' theorem that we may select $\epsilon_2$ such that
$0 < \epsilon_2 < \epsilon_1$, with the property that

(4) if $c$ is an $\epsilon_2$-cycle in $N$ and $vt(c) \sim 0$ in $\mathfrak{M}$, then $c \sim 0$ in $N$.


Let $s$ be any simplex of $\mathfrak{M}$. Since $s$ is acyclic it follows from Vietoris'
theorem that $t^{-1}(s)$ is acyclic. Hence we may select a sequence
$0 < \eta_0 < \eta_1 < \cdots < \eta_n < \epsilon_2$, where $n$ is the dimension of $\mathfrak{M}$, such that

(5) if $c$ is a $(q-1)$-dimensional $\eta_{q-1}$-cycle in $t^{-1}(s)$, where $s$ is any
$q$-simplex of $\mathfrak{M}$, then $c$ bounds an $\eta_q$-chain in $t^{-1}(s)$.

For $q = 1$ we assume in (5) that the zero-dimensional cycle $c$ has the
sum of coefficients zero.
We shall now define a chain transformation $\tau$ with the following properties.

(6) For every $i$-dimensional chain $c$ on $\mathfrak{M}$, $\tau c$ is an $i$-dimensional $\eta_i$-chain
in $N$.

(7) $\partial\tau(c) = \tau(\partial c)$.

(8) $\tau(c_1 + c_2) = \tau(c_1) + \tau(c_2)$.

(9) $\tau(fc) = f\tau(c)$ for every $f$ in the coefficient field $F$.

(10) $\tau(c) \subset t^{-1}(|c|)$, where $|c|$ is the smallest sub-complex of $\mathfrak{M}$
containing $c$.

(11) $vt\tau(c) = c$.

If $c$ is a vertex in $\mathfrak{M}$ taken with multiplicity one, we define $\tau(c)$ to be
any point of $t^{-1}(c)$, also taken with multiplicity one. Conditions (8) and (9)
then determine $\tau(c)$ for every 0-chain $c$. Suppose that the construction of $\tau(c)$
has been carried out for dimensions $< i$ and let $c$ be an oriented $i$-dimensional
simplex of $\mathfrak{M}$ taken with multiplicity 1. Since $\partial\tau(\partial c) = \tau(\partial\partial c) = 0$, it follows
that $\tau(\partial c)$ is an $(i-1)$-dimensional $\eta_{i-1}$-cycle in $t^{-1}(|c|)$. Hence by
(5) there is an $\eta_i$-chain $\tau c$ in $t^{-1}(|c|)$ such that $\partial\tau(c) = \tau(\partial c)$. Further
$vt\tau(c)$ is a chain of $\mathfrak{M}$ on $|c|$ and $\partial[vt\tau(c)] = vt\tau(\partial c) = \partial c$. Hence
$vt\tau(c) = c$. Using (8) and (9) the definition is extended to arbitrary
$i$-dimensional chains on $\mathfrak{M}$.
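Define


(12) $\rho c = vr\tau(c)$


for every chain $c$ on $\mathfrak{M}$. Since $\eta_i < \epsilon_2 < \epsilon_1$ it follows from (6), (3) and (1)
that $\rho c$ is a well defined chain on $\mathfrak{M}$. By (7), (8) and (9), $\rho$ is a chain
transformation, i.e.,

$$\partial\rho(c) = \rho\partial(c), \qquad \rho(c_1 + c_2) = \rho(c_1) + \rho(c_2), \qquad \rho(fc) = f\rho(c).$$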


Consider the homomorphisms on the groups of chains


$$\rho_i\colon C_i(\mathfrak{M}) \to C_i(\mathfrak{M})$$


and the induced homomorphisms on the homology groups

$$\bar\rho_i\colon H_i(M) \to H_i(M).$$


The well known argument involving the additivity of the traces implies
[1], [5]


$$\sum_i (-1)^i \operatorname{trace} \bar\rho_i = \sum_i (-1)^i \operatorname{trace} \rho_i.$$

From (11), (12) and (2) it follows that $\operatorname{trace} \rho_i = 0$. Hence

(13) $\sum_i (-1)^i \operatorname{trace} \bar\rho_i = 0$.


Let $z$ be any cycle in $\mathfrak{M}$. By Vietoris' theorem there is a convergent cycle
$Z = (z_1, z_2, \cdots)$ in $N$ such that $t(Z) \sim z$. For sufficiently large $n$, $\tau(z) - z_n$
is an $\epsilon_2$-cycle in $N$ and $vt(\tau(z) - z_n) = vt\tau(z) - vtz_n = z - vtz_n \sim 0$. Therefore
by (4), $\tau(z) \sim z_n$ in $N$. Consequently, by (1), $\rho(z) = vr\tau(z) \sim vr(z_n)$
in $M$. This shows that $\rho(z) \sim r(Z)$ while $z \sim t(Z)$. If we recall the definition
of the homomorphisms $r_i$ and $t_i$, this shows that the homomorphism induced by $\rho$
on $H_i(M)$ is $r_i t_i^{-1}$. Hence (13) implies that

$$\sum_i (-1)^i \operatorname{trace}(r_i t_i^{-1}) = 0,$$

contrary to assumption.


6. Completion of the proof. As a preliminary we dispose of the case
when $M$ is the cartesian product of a polytope $P$ and the Hilbert cube $Q_\omega$.
By considering the first $n$ coordinates of $Q_\omega$ we represent $Q_\omega$ as the cartesian
product $Q_n \times R_n$ of an $n$-dimensional cube $Q_n$ and a Hilbert cube $R_n$. Define
$P_n = P \times Q_n$; then $M = P_n \times R_n$. Let $\pi_n\colon M \to P_n$ be the natural projection
$\pi_n\colon (x,y) \to x$. For every $x \in P_n$ the set $\pi_n^{-1}(x)$ is homeomorphic with $R_n$.
Hence $\pi_n$ satisfies condition (V). Consider now the maps

$$r\colon N \to M, \qquad t\colon N \to M$$

and the maps

$$r_n = \pi_n r, \qquad t_n = \pi_n t\colon N \to P_n.$$


Since both $\pi_n$ and $t$ satisfy condition (V), so does the combined map $t_n = \pi_n t$.
Since by Vietoris' theorem $\pi_{ni}$ maps $H_i(M)$ isomorphically onto $H_i(P_n)$
we have


$$r_{ni} t_{ni}^{-1} = \pi_{ni}\, r_i t_i^{-1}\, \pi_{ni}^{-1}.$$


Hence $\operatorname{trace}(r_{ni} t_{ni}^{-1}) = \operatorname{trace}(r_i t_i^{-1})$ and therefore $\Lambda(r,t) = \Lambda(r_n, t_n)$.
Assume now that $\Lambda(r,t) \neq 0$. Since $P_n$ is a polytope and $\Lambda(r_n, t_n) \neq 0$,
$r_n$ and $t_n$ have a coincidence. Hence $\pi_n r(a_n) = \pi_n t(a_n)$ for some $a_n \in N$.
If $x$ is a limit point of the set $\{a_n\}$, it follows that $r(x) = t(x)$.




We now pass to the general case when $M$ is an absolute neighborhood
retract. We may assume that $M$ is a subset of a larger space $D$ with the
following properties [2]:

(1) $D$ is the cartesian product of some polytope with a Hilbert cube,

(2) $M$ is a retract of $D$.


Let $\rho\colon D \to M$ be a map retracting $D$ onto $M$ (i.e., $\rho(x) = x$ for $x \in M$).

Consider in the cartesian product $D \times N$ the set $N_1$ consisting of all
points $(x,y)$ such that $x \in M$ and $t(y) = x$. For $(x,y)$ in $N_1$ define $r_1(x,y) = r(y)$
and $t_1(x,y) = t(y) = x$. The map $y \to (t(y), y)$ sets up a homeomorphism
$N \to N_1$, $t \to t_1$, $r \to r_1$ and therefore


(3) $\Lambda(r,t) = \Lambda(r_1, t_1)$.


Next consider in $D \times N$ the set $N_2$ consisting of points $(x,y)$ satisfying
the condition $\rho(x) = t(y)$. For $(x,y)$ in $N_2$ define


$$r_2(x,y) = r(y), \qquad t_2(x,y) = x.$$


Clearly $r_2\colon N_2 \to D$ and $t_2\colon N_2 \to D$. For $x \in D$ the set $t_2^{-1}(x)$ is the set of all
pairs $(x,y) \in D \times N$ satisfying the condition $\rho(x) = t(y)$. Hence $t_2^{-1}(x)$
is homeomorphic with $t^{-1}(\rho(x))$, and therefore $t_2$ satisfies condition (V).
We shall prove that

(4) $\Lambda(r_2, t_2) = \Lambda(r_1, t_1)$.


First note that $N_1 \subset N_2$ and that $r_1 = r_2$ and $t_1 = t_2$ on $N_1$. Since $M$ is a
retract of $D$, every cycle of $M$ that bounds in $D$ also bounds in $M$. Consequently
we may regard $H_i(M)$ as a subgroup of $H_i(D)$. Consider the homomorphisms

$$r_{2i} t_{2i}^{-1}\colon H_i(D) \to H_i(D), \qquad r_{1i} t_{1i}^{-1}\colon H_i(M) \to H_i(M).$$


These two homomorphisms agree on $H_i(M)$. Moreover since $r_2$ maps $N_2$ into
$M$ the image of the homomorphism $r_{2i} t_{2i}^{-1}$ is in the subgroup $H_i(M)$. Hence
the theorem on the additivity of traces implies $\operatorname{trace}(r_{2i} t_{2i}^{-1}) = \operatorname{trace}(r_{1i} t_{1i}^{-1})$
and (4) follows.

Assume now that $\Lambda(r,t) \neq 0$. From (3) and (4) it then follows that
$\Lambda(r_2, t_2) \neq 0$. Because of (1) the theorem can be applied to yield a coincidence
for $r_2$ and $t_2$. Hence there is a pair $x \in D$, $y \in N$ such that


$$\rho(x) = t(y), \qquad r(y) = x.$$


The second condition implies that $x \in M$, hence $\rho(x) = x$ and $r(y) = t(y)$,
q.e.d.


7. A special case. In the special case when $M$ is a euclidean $n$-cell
Theorem 1 can be given a more direct and geometric proof leading to a slightly
more general result.

We shall denote the euclidean $n$-space by $R$. The elements $x \in R$ will be
considered as vectors with $|x|$ the norm of $x$. The $n$-cell $E$ and the $(n-1)$-sphere
$S$ are defined by the conditions $|x| \le 1$, $|x| = 1$.


THEOREM 6. Let $T\colon E \to R$ be a continuous multi-valued function such
that $T(x) \subset E$ for $x \in S$. If for some coefficient group $G \neq 0$ all the sets
$T(x)$ are acyclic then $T$ has a fixed point.


Proof. Consider the euclidean $2n$-space $R \times R$, compactified to the
$2n$-sphere $\overline{R \times R}$ by the addition of a point $\infty$. In $R \times R$ consider the set
$\Gamma_E$ consisting of those pairs $(x,y)$ satisfying the conditions $x \in E$ and $y \in T(x)$.
The subset $\Gamma_S$ of $\Gamma_E$ is defined by requiring $x \in S$.


Consider the mappings $r_E\colon \Gamma_E \to E \times 0$ and $r_S\colon \Gamma_S \to S \times 0$ defined by
$(x,y) \to (x,0)$. Both $r_E$ and $r_S$ satisfy condition (V) and therefore by
Vietoris' theorem


(1) There is an $(n-1)$-dimensional cycle $Z$ in $\Gamma_S$ such that $Z \sim 0$ in
$\Gamma_E$ but $r_S(Z)$ non $\sim 0$ on $S \times 0$.

The cycle $r_S(Z)$ links some $n$-dimensional (integral) cycle on $0 \times R$.
Since $Z$ and $r_S(Z)$ are homologous (even homotopic) outside of $0 \times R$ it
follows that


(2) $Z$ links some (integral) cycle on $0 \times R$.


Consider the diagonal $D$ consisting of points of the form $(x,x)$. Consider
the homotopy,


$$h_t(0,x) = (tx, x), \qquad h_t(\infty) = \infty \qquad \text{for } 0 \le t \le 1,$$


which deforms $0 \times R$ isotopically onto $D$.

Suppose first that the path of this homotopy intersects $\Gamma_S$. Then
$(tx, x) \in \Gamma_S$, or $|tx| = 1$ and $x \in T(tx)$. Since $T(x) \subset E$ for $x \in S$, it follows
that $|x| \le 1$; hence $t = 1$ and $x \in T(x)$. This gives a fixed point.

If the path of the homotopy $h_t$ does not intersect $\Gamma_S$ then by (2) $Z$ links
some cycle of $D$. Since $Z \sim 0$ in $\Gamma_E$ it follows that $\Gamma_E \cap D \neq 0$.
Hence $T$ must have a fixed point.


UNIVERSITY OF MICHIGAN 
AND 
PRINCETON UNIVERSITY. 
SMITH COLLEGE 
AND 
THE INSTITUTE FOR ADVANCED STUDY.


BIBLIOGRAPHY. 


1. Alexandroff, P. and Hopf, H., Topologie I, Berlin 1935.

2. Borsuk, K., "Sur le plongement des espaces dans les rétractes absolus," Fundamenta
Mathematicae, vol. 27 (1936), pp. 239-243.

3. Kakutani, S., "A generalization of Brouwer's fixed point theorem," Duke Mathematical
Journal, vol. 8 (1941), pp. 457-459.

4. Lefschetz, S., "On the fixed point formula," Annals of Mathematics, vol. 38 (1937),
pp. 819-822.

5. Lefschetz, S., Algebraic topology, New York 1942.

6. Lefschetz, S., "Topics in topology," Annals of Mathematical Studies 10, Princeton
1942.

7. Vietoris, L., "Über den höheren Zusammenhang kompakter Räume und eine Klasse
von zusammenhangstreuen Abbildungen," Mathematische Annalen, vol. 97
(1927), pp. 454-472.

8. Von Neumann, J. and Morgenstern, O., Theory of games and economic behavior,
Princeton 1944.

9. Wallace, A. D., "A fixed point theorem for trees," Bulletin of the American Mathematical
Society, vol. 47 (1941), pp. 757-760.


THE VALUES OF THE NORMS IN ALGEBRAIC NUMBER FIELDS.* 


By AUREL WINTNER. 


1. Let $F(n)$, where $n = 1, 2, \cdots$, be a function the values of
which are non-negative integers $k$. Let $\pi_k(m)$ denote the probability that $F$
is in its $k$-th state when $n$ does not exceed $m$, that is, let $m$ times $\pi_k(m)$ be
the number of those among the first $m$ positive integers $n$ for which $F(n)$
attains the value $k$. If the limit $\pi_k(\infty)$ exists for every $k$ ($= 0, 1, \cdots$) and
if the sum of the infinite series $\pi_0(\infty) + \pi_1(\infty) + \cdots$ is 1, the function
$F(n)$ is said to have an asymptotic distribution function. The latter is represented
by the monotone step-function $\alpha(x)$, $-\infty < x < \infty$, which is 0 when
$x < 0$ and has the (non-negative) jump $\pi_k(\infty)$ at $x = k$.

The second of the assumptions required for the existence of an asymptotic
distribution function is independent of the first. In other words, if the limit
$\pi_k(\infty)$ exists for every $k$, the "total probability" can be distinct from 1
(though not of course greater than 1). In fact, it is easily realized that it is
precisely this possibility that is responsible for the type of paradox represented
by the so-called Petersburg problem (cf. [7], p. 184 and pp. 220-222).

This possibility is just a manifestation of the fact that, whether a measure
or probability be relative or not, Fatou's theorem in Lebesgue's theory is an
inequality which cannot in general be replaced by an equality. Correspondingly,
even if $F(n)$ has an asymptotic distribution function, and even if there
exists an asymptotic mean


(1) $M(F) = \lim_{m \to \infty} \sum_{n=1}^{m} F(n)/m$,


all that can be said is that the first moment of the asymptotic distribution
function,


(2) $\int_{-\infty}^{\infty} x\, d\alpha(x)$, i.e., $\sum_{k=0}^{\infty} k\,\pi_k(\infty)$,


cannot be greater than the value (1). However, if the function $F(n)$ is almost-periodic
(B), then not only do both the mean and the asymptotic distribution
function exist but, in addition, the mean is equal to the first moment,


* Received March 14, 1945. 




(3) $M(F) = \sum_{k=0}^{\infty} k\,\pi_k(\infty)$.

Cf. [2], pp. 747-749.


2. With reference to an algebraic number field $\mathfrak{R} = R(\theta)$, let $F(n)
= F(n; \mathfrak{R})$ denote the number of representations of a positive integer $n$ as
norms of integral ideals of $\mathfrak{R}$. In other words, $F(n)$ is the coefficient of $1/n^s$
in the Dirichlet series of Dedekind's $\zeta(s) = \zeta(s; \mathfrak{R})$,


(4) $\zeta(s) = \prod_{\mathfrak{p}} (1 - (N\mathfrak{p})^{-s})^{-1}$,


where $s$ is greater than 1 and $\mathfrak{p}$ runs through all prime ideals of $\mathfrak{R}$. If $\mathfrak{R}$ is
the rational field, then


(5) $\zeta(s) = \sum_{n=1}^{\infty} F(n)/n^s$


becomes Riemann's zeta-function, i.e., $F(n) = 1$ holds for all $n$. But it turns
out that, except when $\mathfrak{R}$ is the rational field, almost all positive integers $n$
cannot be represented as norms of (integral) ideals of $\mathfrak{R}$, i.e., all but $o(m)$
of the first $m$ terms of the Dedekind zeta-series of $\mathfrak{R}$ are missing.


In the above terminology, this can be expressed by saying that $F(n)$
possesses an asymptotic distribution function, $\alpha(x)$, since the asymptotic
probabilities $\pi_k(\infty)$ of the various states exist and are such as to make


(6) $\pi_0(\infty) = 1$, hence $\pi_1(\infty) = \pi_2(\infty) = \cdots = 0$,


except when $\mathfrak{R}$ is the rational field. In this exceptional case, (6) must of
course be replaced by


$\pi_1(\infty) = 1$, hence $\pi_0(\infty) = 0$ and $\pi_2(\infty) = \cdots = 0$,


since $\pi_1(m)$ is 1 for every $m$ ($\ge 1$) in this case.

According to Dirichlet-Dedekind (cf., e.g., [3], p. 230), the mean (1)
of $F(n)$ exists for every $\mathfrak{R}$. In addition,

(7) $M(F) \neq 0$,


since (1) is the residue (at $s = 1$) of the function (4), which has (at $s = 1$)
a simple pole. Finally,


(8) $F(n) = 0, 1, 2, \cdots$, hence $F \ge 0$.


But the content of the assertion (6) is that "$F(n)$ is 0 almost all of the time".
It follows therefore from (8), (7) and (1) that "in the mean, $F(n)$ is very
large when it is not 0".
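
The contrast between (6) and (7) can be observed numerically in one concrete field. The sketch below is an illustration only and is not part of the paper: it assumes Gauss' quadratic field $R(i)$, for which the classical factorization of the Dedekind zeta-function into Riemann's zeta-function and a Dirichlet $L$-series gives $F(n) = \sum_{d \mid n} \chi(d)$, with $\chi$ the non-principal character mod 4. The helper names `chi4`, `norm_counts`, `zero_share` and `partial_mean` are hypothetical.

```python
import math


def chi4(d):
    """Non-principal character mod 4: +1 or -1 for d = 1 or 3 (mod 4), else 0."""
    return {1: 1, 3: -1}.get(d % 4, 0)


def norm_counts(limit):
    """F[n] = sum of chi4(d) over divisors d of n, for 1 <= n <= limit."""
    F = [0] * (limit + 1)
    for d in range(1, limit + 1):
        c = chi4(d)
        if c:
            for n in range(d, limit + 1, d):
                F[n] += c
    return F


def zero_share(F, m):
    """pi_0(m): proportion of n <= m with F(n) = 0."""
    return sum(1 for n in range(1, m + 1) if F[n] == 0) / m


def partial_mean(F, m):
    """Partial mean (F(1) + ... + F(m)) / m appearing in (1)."""
    return sum(F[1 : m + 1]) / m


F = norm_counts(10_000)
for m in (100, 1_000, 10_000):
    print(m, round(zero_share(F, m), 4), round(partial_mean(F, m), 4))
print("pi/4 =", round(math.pi / 4, 4))
```

In this range the share of integers with $F(n) = 0$ grows slowly with $m$ while the partial means stay close to $\pi/4$, the residue of the zeta-function of this particular field; the slow growth is in keeping with the remark that the finer rate questions require heavier machinery.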


3. During the last decade, various classical results concerning the existence
of mean-values of arithmetical functions, and of remainder terms of
"explicit sum formulae" in the analytic theory of numbers, have been replaced
by results establishing the almost-periodicity (B) of these functions. Such
results imply the corresponding classical results, since the existence of a mean-value
is necessary for almost-periodicity (B). On the other hand, it is easy
to give examples of functions $F(n)$, even of non-negative functions, which
are not almost-periodic (B), although the mean-value (1) exists. However,
such examples usually depend on artificial constructions and have therefore no
arithmetical significance.

Thus, in order to show that the improvements of the classical results in
question are not without arithmetical relevance, it is desirable to exhibit a
class of functions of arithmetical significance which are not almost-periodic (B),
although their mean-values exist. And it is easy to conclude that the above
functions $F(n)$ are of this type, except when $\mathfrak{R}$ is the rational field.

In fact, let us suppose that $F(n)$ is almost-periodic (B). Then (3) is
applicable. But the term belonging to $k = 0$ in (3) is 0, since the $k$-th term
of (3) contains the factor $k$. It follows therefore from (6) that the expression
(3) is 0. But this contradicts (7).


Remark. The assertion (6) supplies, for every fixed $k$, limiting values,
$\pi_k(\infty)$, for the probabilities $\pi_k(m)$ as $m \to \infty$. There arises the finer question
as to the asymptotic behavior of the error terms $\pi_k(\infty) - \pi_k(m)$ as
$m \to \infty$, where $k$ is arbitrarily fixed. The answer to this question is known in
case $\mathfrak{R}$ is Gauss' quadratic field, $R(i)$; cf. [8], pp. 61-66. Actually, the
method applied in [8], pp. 61-66, to the Gaussian field could be transferred
to the case of every algebraic number field (excepting the trivial case of the
rational field), if use is made of those asymptotic results concerning factorizations
in $\mathfrak{R}$ which were announced (without proof; cf. pp. 263-265 and p. 537
of Hilbert's Zahlbericht [3]) by Kronecker [5] and subsequently proved, in
refined formulations, by Frobenius [1]. However, this procedure would involve
the same analytical machinery on which the Prime Number Theorem depends.
In fact, it would involve extensions of Ikehara's theorem; cf. [8], p. 65 and
p. 66. On the other hand, the proof of (6) itself will be "elementary" (in
every sense customary in the analytic theory of numbers).




4. With reference to a fixed field $\mathfrak{R}$ and to every rational prime $p$, let
$j = j(p) \ge 1$ denote the number of the distinct prime ideals dividing the
principal ideal $[p]$, and let $g_1 = g_1(p) \ge 1, \cdots, g_j = g_j(p) \ge 1$ be the
respective degrees of these $j = j(p)$ prime ideals. Then (4) can be rearranged
into

(9) $\zeta(s) = \prod_p (1 - p^{-g_1 s})^{-1} \cdots (1 - p^{-g_j s})^{-1}$,


where $s > 1$. Since $F(n)$ may be defined by identifying (9) with (5), it
follows that


(10) $F(n_1 n_2) = F(n_1) F(n_2)$ if $(n_1, n_2) = 1$

(i.e., if $n_1$ and $n_2$ are relatively prime).


If $e_1 = e_1(p) \ge 1, \cdots, e_j = e_j(p) \ge 1$ denote the respective multiplicities
of the $j = j(p)$ distinct prime ideals occurring in the factorization
of the principal ideal $[p]$, then the sum $e_1 g_1 + \cdots + e_j g_j$ is independent
of $p$, since it is the degree of $\mathfrak{R}$. But it is known that the factorization of $[p]$
into prime ideals contains a multiple factor if and only if the rational prime $p$
which defines $[p]$ is a divisor of the discriminant of $\mathfrak{R}$. In particular, every
$e_h = e_h(p)$ is 1 as soon as $p$ exceeds the discriminant of $\mathfrak{R}$. Consequently,
the sum of the $j = j(p)$ positive integers $g_h = g_h(p)$ is the degree of $\mathfrak{R}$ as
soon as $p$ is large enough.

Since (9) is identical with (5), it follows, in particular, that 


(11) F(p*) S const. 


holds for all rational primes $p$, for all positive integers $l$ and for a certain
constant determined by $\mathfrak{R}$ alone.
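It also follows that the difference


(12) $\sum_{N\mathfrak{p} < x} \dfrac{1}{N\mathfrak{p}} - \sum_{p < x} \dfrac{F(p)}{p}$


tends to a finite limit as $x \to \infty$. In order to see this, it is sufficient to identify
both (5) and (9) with (4), and to observe that


$$\sum_p \sum_{m=2}^{\infty} \frac{\text{const.}}{p^m} < \sum_p \frac{\text{const.}}{p^2} < \infty.$$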


5. A function $F(n)$ of the positive integer $n$ is called multiplicative if
it satisfies condition (10).


If $S$ is a set of distinct positive integers, its characteristic function,
$S(n)$, is defined to be the function which is 1 or 0 according as $n$ is or is
not in $S$. A set $S$ having a multiplicative characteristic function is called a
multiplicative set.

If $F(n)$ is any multiplicative function, and if $S_F$ denotes the set of
those positive integers $n$ at which $F(n)$ is distinct from 0, then $S_F$ is a
multiplicative set. In order to see this, it is sufficient to compare condition
(10) with the definition of a multiplicative set. In fact, it is clear that a set
$S$ is a multiplicative set if, and only if, it has the following property: The
product of two integers which are relatively prime is in $S$ or is not in $S$
according as both factors are in $S$ or at least one of them is not in $S$.

With reference to any multiplicative set $S$, let $S_m$ denote the number of
those of its elements which do not exceed $m$. Then the ratio $S_m/m$ tends to a
limit as $m \to \infty$, and the value of this limit is 0 or positive (but of course
not greater than 1) according as the sum of the reciprocal values of those
prime numbers which are not contained in $S$ is divergent or convergent. This
is an immediate consequence of the sieve-process of Eratosthenes; cf., e.g.,
[8], p. 68.
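
The dichotomy just stated can be watched on two simple multiplicative sets. The following sketch is illustrative only; the two sets and the helper names `density`, `coprime_to_6` and `no_prime_factor_3_mod_4` are assumptions chosen for the example and do not occur in the paper. In the first set only the primes 2 and 3 are missing, so the sum of their reciprocals converges; in the second, every prime $p \equiv 3 \pmod 4$ is missing, and the sum of their reciprocals diverges.

```python
from math import gcd


def density(in_set, m):
    """S_m / m for the set described by the predicate in_set."""
    return sum(1 for n in range(1, m + 1) if in_set(n)) / m


def coprime_to_6(n):
    """Multiplicative set from which only the primes 2 and 3 are excluded."""
    return gcd(n, 6) == 1


def no_prime_factor_3_mod_4(n):
    """Multiplicative set excluding every prime p = 3 (mod 4)."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            if p % 4 == 3:
                return False
            while n % p == 0:
                n //= p
        p += 1
    # What remains is 1 or a single prime factor larger than the square root.
    return n % 4 != 3


for m in (1_000, 20_000):
    print(m, density(coprime_to_6, m), density(no_prime_factor_3_mod_4, m))
```

The first ratio settles at $(1 - 1/2)(1 - 1/3) = 1/3$, while the second keeps decreasing, though slowly, as $m$ grows.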

Since the coefficients of (5) determine a multiplicative function of $n$,
the corresponding set $S_F$, the $n$-set on which $F(n)$ does not vanish, is a multiplicative
set, and so the ratio $S_m/m$ will tend to 0 or to a positive limit
according as the sum of the reciprocal values of those primes $p$ for which
$F(p)$ vanishes is divergent or convergent. It follows therefore by
subtraction, that (6) is equivalent to the following statement: The sum of
the reciprocal values of all primes $p$ satisfying $F(p) = 0$ is divergent,


(13) $\sum_{F(p) = 0} \dfrac{1}{p} = \infty$.


Accordingly, everything will be proved if it is verified that (13) is true
for every algebraic number field distinct from the rational field. (In the case
of the rational field, the sum (13), instead of being divergent, is vacuous,
i.e., 0.)


6. According to Mertens' elementary approximation to the Prime
Number Theorem, the difference


(14) $\sum_{p < x} \dfrac{1}{p} - \log\log x$


tends to a finite limit as $x \to \infty$. And all the known proofs of this fact (cf.,
e.g., [4], pp. 22-24) apply, without any change, to the case of any algebraic
number field; in the sense that the difference


(15) $\sum_{N\mathfrak{p} < x} \dfrac{1}{N\mathfrak{p}} - \log\log x$


tends, as $x \to \infty$, to a finite limit in every $\mathfrak{R}$ (for a detailed proof, cf. [6],
pp. 150-151).


Since all three differences (12), (14), (15) tend to finite limits as
$x \to \infty$, the same is true of the difference


$$\sum_{p < x} \frac{F(p)}{p} - \sum_{p < x} \frac{1}{p}.$$

This means that the infinite series

(16) $\sum_p \dfrac{F(p) - 1}{p}$,


in which $p$ runs through the (monotone) sequence of all rational primes $p$,
is convergent. But every $F(p)$ is a non-negative integer. Hence, the summation
in (13) runs over those primes $p$ for which the corresponding term of the
series (16) becomes negative. It follows, therefore, from the convergence of
(16) that (13) must be true unless the complementary series, i.e., the series


(17) $\sum_{F(p) > 0} \dfrac{F(p) - 1}{p}$


is convergent. Consequently, it is sufficient to verify the divergence of the
series (17).

Since every $F(p)$ is an integer, every $F(p) - 1$ occurring in (17) is
either 0 or not less than 1. Hence, in order to assure the divergence of the
series (17), it is sufficient to ascertain that


(18) $\sum_{F(p) > 1} \dfrac{1}{p} = \infty$.


But the assertions of Kronecker [5], verified by Frobenius [1], imply the
truth of (18) for every algebraic number field $\mathfrak{R}$, except for the rational field.
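
As an illustration only (it is not part of the original argument), the series (13), (16) and (18) can be written out for Gauss' quadratic field $R(i)$, where the splitting of the rational primes is classical. The character $\chi$ below, the non-principal character mod 4, is notation introduced just for this example.

```latex
% Splitting of rational primes in R(i): 2 ramifies, p = 1 (mod 4) splits
% into two prime ideals of degree 1, and p = 3 (mod 4) remains prime.
\[
  F(2) = 1, \qquad
  F(p) = 2 \ \text{ if } p \equiv 1 \pmod 4, \qquad
  F(p) = 0 \ \text{ if } p \equiv 3 \pmod 4 .
\]
% The series (16) becomes a character sum over the odd primes:
\[
  \sum_{p} \frac{F(p) - 1}{p}
  \;=\; \sum_{p \equiv 1 \,(4)} \frac{1}{p} \;-\; \sum_{p \equiv 3 \,(4)} \frac{1}{p}
  \;=\; \sum_{p > 2} \frac{\chi(p)}{p} .
\]
% The statements (13) and (18) read, respectively,
\[
  \sum_{p \equiv 3 \,(4)} \frac{1}{p} = \infty ,
  \qquad
  \sum_{p \equiv 1 \,(4)} \frac{1}{p} = \infty .
\]
```

In this special case both divergences are instances of Dirichlet's theorem on primes in arithmetical progressions, and the convergence of the middle series is the classical companion fact for a non-principal character.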


THE JOHNS HOPKINS UNIVERSITY.


REFERENCES.


[1] G. Frobenius, "Ueber Beziehungen zwischen den Primidealen eines algebraischen
Körpers und den Substitutionen seiner Gruppe," Sitzungsberichte der
Akademie der Wissenschaften zu Berlin, 1896, pp. 689-703.

[2] P. Hartman and A. Wintner, "On the standard deviations of additive arithmetical
functions," American Journal of Mathematics, vol. 62 (1940), pp. 743-752.

[3] D. Hilbert, "Zahlbericht," Jahresbericht der Deutschen Mathematiker-Vereinigung,
vol. 4 (1894-95 [1897]), pp. 177-546.

[4] A. E. Ingham, The Distribution of Prime Numbers, Cambridge Tracts No. 30, 1932.

[5] L. Kronecker, Werke, vol. 2, pp. 85-93.

[6] E. Landau, "Ueber die zu einem algebraischen Zahlkörper gehörige Zetafunktion
und die Ausdehnung der Tschebyschefschen Primzahltheorie auf das Problem
der Verteilung der Primideale," Journal für die reine und angewandte Mathematik,
vol. 125 (1903), pp. 64-188.

[7] I. Todhunter, A History of the Mathematical Theory of Probability, Cambridge and
London, 1865.

[8] A. Wintner, Eratosthenian Averages, Baltimore, 1943.

THE ASYMPTOTIC NUMBER OF LATIN RECTANGLES.* 


By PAUL ERDÖS and IRVING KAPLANSKY.


1. Introduction. The problem of enumerating $n$ by $k$ Latin rectangles
was solved formally by MacMahon [4] using his operational methods. For
$k = 3$, more explicit solutions have been given in [1], [2], [3], and [5].
While further exact enumeration seems difficult, it is an easy heuristic
conjecture that the number of $n$ by $k$ Latin rectangles is asymptotic to
$(n!)^k \exp(-{}_kC_2)$. Because of an error, Jacob [2] was led to deny this conjecture
for $k = 3$; but Kerawala [3] rectified the error and then verified the
conjecture to a high degree of approximation. The first proof for $k = 3$
appears to have been given by Riordan [5].

In this paper we shall prove the conjecture not only for $k$ fixed (as
$n \to \infty$) but for $k < (\log n)^{3/2-\epsilon}$. As indicated below, a considerably shorter
proof could be given for the former case. The additional detail is perhaps
justified by (1) the interest attached to an approach to Latin squares ($k = n$),
(2) the emergence of further terms of an asymptotic series (§ 4), (3) the fact
that $(\log n)^{3/2}$ appears to be a "natural boundary" of the method. (We
believe however that the actual break occurs at $k = n^{1/3}$.)


2. Notation. An $n$ by $k$ Latin rectangle $L$ is an array of $k$ rows and $n$
columns, with the integers $1, \cdots, n$ in each row and all distinct integers in
each column. Let $N$ be the number of ways of adding a $(k+1)$-st row to $L$
so as to make the augmented array a Latin rectangle. We use the sieve method
(method of inclusion and exclusion) to obtain an expression for $N$. From $n!$,
the total number of possible choices for the $(k+1)$-st row, we take away
those having a clash with $L$ in a given column, summed over all choices of
that column, then reinstate those having clashes in two given columns, etc.
The result can be written

(1) $N = \sum_{r=0}^{n} (-)^r A_r (n-r)!$

where $A_r$ is the number of ways of choosing $r$ distinct integers in $L$, no two
in the same column. In particular $A_0 = 1$, $A_1 = nk$. To estimate the higher
values of $A_r$ we apply the sieve method again. The total number of ways of


* Received November 30, 1945. 




selecting $r$ elements of $L$, not necessarily distinct integers but with no two in
the same column, is ${}_nC_r k^r$. This over-estimates $A_r$; we have to take away those
selections which include a specified pair of 1's, 2's, $\cdots$, or $n$'s, then reinstate
those which include two pairs, etc. We may write the result


(2) $A_r = \sum_s (-)^s B(r,s)$.


Here $B(r,s)$ is precisely defined as follows. Take any $s$ of the $n\,{}_kC_2$ pairs of
1's, $\cdots$, $n$'s which can be formed in $L$. Suppose that this selection involves in
all $y$ elements; $y$ may be as large as $2s$, or as small as the integer for which
${}_yC_2 \ge s$. Find the number of ways of adjoining $r - y$ further elements, so as
to form a set of $r$ elements with no two in the same column. The result of
summing over all choices of $s$ pairs is, by definition, $B(r,s)$. We note in
particular that


(3) $B(r,0) = {}_nC_r\, k^r$,
(4) $B(r,1) = n\,{}_kC_2 \cdot {}_{n-2}C_{r-2}\, k^{r-2}$.


The $B$'s may be analyzed further as follows. Let $F(s,t)$ be the number
of ways of choosing $s$ pairs of 1's, $\cdots$, $n$'s, which use up $t$ elements in all,
and for which no two of the $t$ elements lie in the same column. The number
of ways of expanding this selection of $t$ elements to $r$ elements, with no two
in the same column, is ${}_{n-t}C_{r-t}\, k^{r-t}$. Hence


(5) $B(r,s) = \sum_t F(s,t)\; {}_{n-t}C_{r-t}\, k^{r-t}$.


It is to be observed that extreme limits for the summation in (5) are given by
$t \le 2s$ and $s \le {}_tC_2$ or, more generously, $\sqrt{s} \le t$.
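
The sieve formula (1) and the first two terms of (2) can be checked by direct enumeration on a small rectangle. The sketch below is a brute-force illustration under stated assumptions and not the authors' procedure: the rectangle `L` (two cyclically shifted rows with $n = 5$) and the helper names `count_A`, `extensions_by_sieve` and `extensions_by_brute_force` are hypothetical choices made for the example.

```python
from itertools import combinations, permutations
from math import comb, factorial


def count_A(L, r):
    """A_r: ways to pick r cells of L with distinct entries, no two in one column."""
    k, n = len(L), len(L[0])
    cells = [(i, j) for i in range(k) for j in range(n)]
    total = 0
    for pick in combinations(cells, r):
        columns = {j for _, j in pick}
        values = {L[i][j] for i, j in pick}
        if len(columns) == r and len(values) == r:
            total += 1
    return total


def extensions_by_sieve(L):
    """N computed from formula (1)."""
    n = len(L[0])
    return sum((-1) ** r * count_A(L, r) * factorial(n - r) for r in range(n + 1))


def extensions_by_brute_force(L):
    """N computed by testing every candidate row against every column of L."""
    k, n = len(L), len(L[0])
    return sum(
        all(row[j] != L[i][j] for i in range(k) for j in range(n))
        for row in permutations(range(1, n + 1))
    )


# Assumed sample rectangle: k = 2 rows, n = 5 columns.
L = [[1, 2, 3, 4, 5],
     [2, 3, 4, 5, 1]]
n, k = 5, 2

print(extensions_by_sieve(L), extensions_by_brute_force(L))
# For r = 2 only s = 0 and s = 1 occur in (2), so A_2 = B(2,0) - B(2,1).
print(count_A(L, 2), comb(n, 2) * k**2 - n * comb(k, 2))
```

Both computations of $N$ agree, and $A_2$ agrees with $B(2,0) - B(2,1)$ as given by (3) and (4); for larger $r$ further terms of (2) enter.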

These quantities $F(s,t)$ are the ultimate building blocks from which the
exact value of $N$ is constructed. We shall discuss them further in § 4. For the
present the following crude inequality will suffice:


(6) $\sum_s F(s,t) < n^{t/2} (k^2 t)^{t^2}$.


The proof of (6) is as follows. The left hand side is just the number of ways
of choosing a set of (any number of) pairs which involve in all precisely $t$
elements. In such a choice at most $[t/2]$ distinct integers are permissible,
and these may be taken in less than $n^{t/2}$ ways. In all we have at most
${}_tC_2 < t^2$ pairs to dispose of in the selection. For each of these $t^2$ pairs we
have ${}_kC_2\, t/2 < k^2 t$ possibilities and hence for all of them at most $(k^2 t)^{t^2}$
choices. This establishes (6).

The various quantities defined in this section will be used without further 
explanation in the remainder of the paper. 

3. Proof of the main result. We first prove 

THEOREM 1. If $k < (\log n)^{3/2-\epsilon}$, then for sufficiently large $n$


(7) $|N e^k/n! - 1| < n^{-c}$


where $c$ is a positive constant depending only on $\epsilon$.


Proof. Define $A(r,x)$ by


(8) $A(r,x) = \sum_{s=1}^{x-1} (-)^s B(r,s)$,


where $x = [(\log n)^{1-\epsilon}]$. Then by the sieve's well known property of being
alternately in excess and defect we have

(9) $|A_r - B(r,0) - A(r,x)| \le B(r,x)$.

In (1) make the substitution

$$A_r = \{A_r - B(r,0) - A(r,x)\} + B(r,0) + A(r,x)$$


and use (3) and (9). We find


(10) $\left| N - n! \sum_{r=0}^{n} (-k)^r/r! - G \right| \le H$,


where

(11) $G = \sum_r (-)^r A(r,x)(n-r)!$,

(12) $H = \sum_r B(r,x)(n-r)!$.

We proceed to study $G$. With the use of (8) and (5), and an interchange
of summation signs, (11) becomes


$$G/n! = \sum_{s=1}^{x-1} (-)^s \sum_t F(s,t) \sum_{r=t}^{n} \frac{(-)^r k^{r-t}}{(n)_t\,(r-t)!},$$


where $(n)_r = n(n-1)\cdots(n-r+1)$ is the Jordan factorial notation.
The change of variable $r = t + u$ transforms the final sum into


$$\frac{(-)^t}{(n)_t} \sum_{u=0}^{n-t} \frac{(-k)^u}{u!} = \frac{(-)^t (e^{-k} - \theta)}{(n)_t},$$

where $\theta$ is the remainder after $n - t$ terms of the series for $e^{-k}$. Then


(13) $G e^k/n! = \sum_{s=1}^{x-1} (-)^s \sum_t (-)^t F(s,t)(1 - \theta e^k)/(n)_t$.


As noted above, the limits for $t$ lie between $\sqrt{s}$ and $2s$. Hence $t \le 2x < 2\log n$.
From this we readily deduce


(14) $1/(n)_t < c_1/n^t$,
(15) $|\theta e^k| < c_2$,


where $c_1, c_2$ are absolute constants. From (6), (13), (14), and (15) we
obtain


$$|G|\, e^k/n! < c_3 \sum_{t=1}^{2x} (k^2 t)^{t^2}/n^{t/2}$$


with $c_3 = c_1(1 + c_2)$. In the fraction under the summation sign, the logarithms
of numerator and denominator are respectively of the orders $t^2 \log\log n$
and $t \log n$. Since $t < 2(\log n)^{1-\epsilon}$, it follows that for large $n$


where $c_4$ is a positive constant depending only on $\epsilon$. Hence


(16) |G 


ek&/n! < < 


We next turn our attention to the term $H$ given by (12). From (5)
and an interchange of orders of summation,


$$H/n! = \sum_t F(x,t) \sum_{r=t}^{n} \frac{k^{r-t}}{(n)_t\,(r-t)!}.$$


The final sum is the product of $1/(n)_t$ by a portion of the series for $e^k$.
Hence


$$H/n! < e^k \sum_t F(x,t)/(n)_t < c_1 e^k \sum_t (k^2 t)^{t^2}/n^{t/2}$$


by (6) and (14). The fraction to be estimated is the same as above but
the summation now starts at $t = \sqrt{x}$, which is at least a constant multiple of
$(\log n)^{1/2-\epsilon/2}$. It follows that $t \log n$ is at least a constant multiple of
$(\log n)^{3/2-\epsilon/2}$, and we are able to swallow up a further term $e^{2k}$ whose
logarithm is less than $2(\log n)^{3/2-\epsilon}$. Hence for large $n$


and 


(17) He¥/n! < < 


Combining (16), (17), and (10), we obtain (7), for the sum on the left of 
(10) may run to infinity at a cost of O(n). This concludes the proof. 
(We may note that for the case where $k$ is fixed as $n \to \infty$, the proof
could be abridged as follows. We take $x = 1$; then the term $G$ disappears,
and an estimate of $H$ is easily obtained from (4).)
From Theorem 1 we readily derive our main result:


THEOREM 2. Let $f(n,k)$ be the number of $n$ by $k$ Latin rectangles and
suppose $k < (\log n)^{3/2-\epsilon}$. Then


(18) $f(n,k)\,(n!)^{-k} \exp({}_kC_2) \to 1$ as $n \to \infty$.


Proof. From Theorem 1 it follows that $f(n, i+1)$ lies between the
limits $f(n,i)\, n!\, e^{-i}(1 \pm n^{-c})$. Taking the product from $i = 1$ to $k - 1$, we
find that $f(n,k)$ lies between the limits


$$(n!)^k \exp(-{}_kC_2)\,(1 \pm n^{-c})^k.$$

Since $(1 + n^{-c})^k$ and $(1 - n^{-c})^k \to 1$ as $n \to \infty$, we obtain (18).
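
The simplest non-trivial instance of (18) is $k = 2$, where the second row must be a derangement of the first, so that $f(n,2) = n!\,D_n$ with $D_n$ the number of derangements of $n$ objects. The sketch below is an illustration only; the function names `derangements` and `normalized_count_two_rows` are hypothetical, and the recurrence $D_m = (m-1)(D_{m-1} + D_{m-2})$ is the classical one rather than anything taken from the paper.

```python
from math import e, factorial


def derangements(n):
    """D_n via the classical recurrence D_m = (m - 1) * (D_{m-1} + D_{m-2})."""
    if n == 0:
        return 1
    d_prev, d = 1, 0  # D_0 and D_1
    for m in range(2, n + 1):
        d_prev, d = d, (m - 1) * (d + d_prev)
    return d


def normalized_count_two_rows(n):
    """Left side of (18) for k = 2: f(n, 2) * (n!)**(-2) * exp(1) = D_n * e / n!."""
    return derangements(n) * e / factorial(n)


for n in (3, 5, 10, 15):
    print(n, normalized_count_two_rows(n))
```

For $k = 2$ the ratio approaches 1 extremely fast; the slower corrections in powers of $1/n$ appear only from $k = 3$ on.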


4. Further terms of the asymptotic series. A more careful argument
reveals that the error term in (7) is actually of the order of $k^2 n^{-1}$. By detaching
the term $B(r,1)$ as well as $B(r,0)$ in (2), we can reduce the error to
the order of $k^4 n^{-2}$. Continuing in this fashion, we may compute successive
terms of an asymptotic series. The existence of such a series was conjectured
by Jacob [2, p. 337].

We shall merely sketch the results. Applying (1), (2), and (5) as we 
did in 3, we find 


N/n! (—)*E F(s, t) (e*—0)/(n)t. 


nd 


he 


we 


THE ASYMPTOTIC NUMBER OF LATIN RECTANGLES. 


The term 6 may be dropped and we have 


F(1, 2) F(2, 3) F'(2, 4) 
(19) Neé/n! =1— + + 
Thus all that is required is evaluation of the F's. That $F(1,2) = n\,{}_kC_2$ was 
already implicitly noted in (4). For $F(2,3)$ we observe that not more than 
one integer may be used, that there are then $n\,{}_kC_3$ choices for the three 
elements, and 3 choices for the two pairs within them. Hence $F(2,3) = 3n\,{}_kC_3$. 
Similarly $F(2,4)$ includes the term $3n\,{}_kC_4$, corresponding to the choice of 
only one integer. If two different integers are taken, there are ab initio 
${}_nC_2({}_kC_2)^2$ choices; but we must eliminate selections which include two 
elements in the same column. An application of the sieve process to this last 
difficulty yields 


$F(2,4) = 3n\,{}_kC_4 + {}_nC_2({}_kC_2)^2 - n\,{}_kC_2(k-1)^2 + X,$ 


where $X$ is the number of instances in which integers $i, j$ both occur in two 
different columns. It is noteworthy that this is the first term which depends 
upon the particular Latin rectangle to which a $(k+1)$-st row is being added. 
A simple argument shows that $X \le n\,{}_kC_2(k-1)$, so that $X/(n)_4$ is of 
order $n^{-3}$ or less, as are all the later terms of (19). Hence we have, correct 
up to terms in $n^{-2}$: 


(20) $N e^{k}/n! = 1 - {}_kC_2/n + {}_kC_2(k+4)(3k-7)/12n^2.$ 


By taking the product of the terms (20) from 1 to k—1, we obtain the 
asymptotic series for f(n, k), the number of Latin rectangles: 


(21) $f(n,k)\,(n!)^{-k}\exp({}_kC_2) = 1 - {}_kC_3/n + {}_kC_3(k^3 - 3k^2 + 8k - 30)/12n^2 + \cdots.$ 


For $k = 3$, the right side of (21) becomes $1 - 1/n - 1/2n^2 + \cdots$. In the 
table below we compare this with the exact value given by Kerawala in [3]. 
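
The passage from the row-by-row expansion (20) to the product (21) is a routine but error-prone series multiplication, and it can be checked with exact rational arithmetic. The sketch below is an illustration only: it assumes the coefficients of (20) read as 1 - C(i,2)/n + C(i,2)(i+4)(3i-7)/12n^2 for the i-th factor and those of (21) as 1 - C(k,3)/n + C(k,3)(k^3 - 3k^2 + 8k - 30)/12n^2, and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def row_coeffs(i):
    """Coefficients (a, b) of the factor 1 - a/n + b/n^2 for adding row i + 1."""
    a = Fraction(comb(i, 2))
    b = Fraction(comb(i, 2) * (i + 4) * (3 * i - 7), 12)
    return a, b


def product_coeffs(k):
    """Coefficients (c1, c2) of the product of the factors i = 1..k-1,
    written as 1 + c1/n + c2/n^2 and truncated after the n^-2 term."""
    c1, c2 = Fraction(0), Fraction(0)
    for i in range(1, k):
        a, b = row_coeffs(i)
        # (1 + c1/n + c2/n^2)(1 - a/n + b/n^2), keeping terms up to n^-2
        c1, c2 = c1 - a, c2 - a * c1 + b
    return c1, c2


def closed_form(k):
    """Coefficients of 1/n and 1/n^2 as they appear in (21)."""
    c3 = comb(k, 3)
    return Fraction(-c3), Fraction(c3 * (k**3 - 3 * k**2 + 8 * k - 30), 12)


if __name__ == "__main__":
    for k in range(1, 8):
        print(k, product_coeffs(k), closed_form(k))
```

For k = 3 both routes give the coefficients -1 and -1/2, in agreement with the expression 1 - 1/n - 1/2n^2 quoted in the text.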


  n      1 - 1/n - 1/2n^2      Exact value of (21) 

  5          .78                   .76995 
 10          .895                  .89560 
 15          .93111                .93126 
 20          .94875                .94881 
 25          .9592                 .95923 
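
The first row of the table can be reproduced by direct enumeration. The sketch below is a brute-force check, not Kerawala's difference-equation method: it fixes the first row as the identity, counts the admissible second and third rows, and multiplies by n! to recover f(n,3). Function names are hypothetical, and the enumeration is practical only for small n.

```python
from itertools import permutations
from math import exp, factorial


def three_row_extensions(n):
    """Number of (second row, third row) pairs completing the identity first row
    to an n by 3 Latin rectangle."""
    first = tuple(range(n))
    perms = list(permutations(range(n)))
    count = 0
    for second in perms:
        if any(second[c] == first[c] for c in range(n)):
            continue  # second row must disagree with the first in every column
        for third in perms:
            if all(third[c] != first[c] and third[c] != second[c] for c in range(n)):
                count += 1
    return count


def exact_ratio(n):
    """f(n,3) (n!)^(-3) exp(3), using f(n,3) = n! * three_row_extensions(n)."""
    return three_row_extensions(n) * exp(3) / factorial(n) ** 2


def series_value(n):
    """Right side of (21) for k = 3, truncated after the n^-2 term."""
    return 1 - 1 / n - 1 / (2 * n**2)


if __name__ == "__main__":
    for n in (4, 5, 6):
        print(n, series_value(n), exact_ratio(n))
```

For n = 5 the enumeration gives 552 extensions, hence f(5,3) = 66240, and the ratio agrees with the tabulated .76995.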




In attempting to push the asymptotic series still further, we run into 
the difficulty that terms like $X$, i.e., terms dependent upon the preceding 
Latin rectangle, begin to play a rôle in (20). However, it may be that in 
(21) at least the term in $n^{-3}$ can be obtained without consideration of $X$, for 
heuristically it seems likely that the "expectation" of $X$ is $o(n)$. 

In conclusion we remark that the form of (21) strongly suggests that at 
about $k = n^{1/3}$ the expression ceases to be valid. We are unable to prove this 
rigorously. 


STANFORD UNIVERSITY 
AND 
UNIVERSITY OF CHICAGO. 


BIBLIOGRAPHY 


1. L. Dulmage, Problem E 650, American Mathematical Monthly, vol. 51 (1944), 
pp. 586-587. 

2. S. M. Jacob, “The enumeration of the Latin rectangle of depth three .. .”, 
Proceedings of the London Mathematical Society, vol. 31 (1930), pp. 329-354. 

3. S. M. Kerawala, “The enumeration of the Latin rectangle of depth three by 
means of a difference equation,” Bulletin of the Calcutta Mathematical Society, vol. 33 
(1941), pp. 119-127. 

4. P. A. MacMahon, Combinatory Analysis, vol. I, Cambridge, 1915, pp. 246-263. 

5. J. Riordan, "Three-line Latin rectangles," American Mathematical Monthly, 
vol. 51 (1944), pp. 450-452. 



A MATRIX DIFFERENTIAL EQUATION OF RICCATI TYPE.* 


By WILLIAM T. REID. 


1. Introduction. This paper is concerned with the matrix differential 
equation 


(1.1) $W' + WA(x) + D(x)W + WB(x)W = C(x),$ 


where $A(x)$, $B(x)$, $C(x)$ and $D(x)$ are given $n \times n$ square matrices whose 
elements are continuous functions¹ of the real variable $x$ on the finite and 
closed interval $ab$: $a \le x \le b$. Section 2 of this paper contains generalizations 
of some well known theorems on the solutions of a single ordinary 
differential equation of Riccati type. In particular, Theorems 2.2 and 2.3 
provide decided extensions of results proved by Whyburn [6]² for the matrix 
differential equation 


(1.2) $W' + WW = C(x).$ 


Section 3 is concerned with the relation of the matrix equation (1.1) to a 
system of 2n linear homogeneous differential equations. Finally, in Section 4 
it is shown that the analogue of Legendre’s differential equation for simple 
integral problems of the calculus of variations in (n + 1)-space is a matrix 
differential equation of the form (1.1). 

Throughout this paper capital italic letters denote n-rowed square matrices. 
In particular, $n \times 1$ rectangular matrices are referred to as vectors. The 
transpose of a matrix $A$ is indicated by $\tilde{A}$, the reciprocal of a non-singular 
matrix $A$ by $A^{-1}$, and, if the elements of $A$ are differentiable functions, the 
matrix of derivatives is denoted by $A'$. 


2. Solutions of the matrix differential equation (1.1). The following 
preliminary result of Lemma 2.1 has been noted previously by the 
author [5]. 


LEMMA 2.1. If the elements of the matrices $H(x)$, $K(x)$ are continuous 
on $ab$, then the general solution $T(x)$ of the matrix differential 
equation 


(2.1) $T' = H(x)T + TK(x)$ 

is of the form $T(x) = T_1(x)\,C\,T_2(x)$, where $T_1(x)$ is a non-singular (fundamental) 
solution of $T_1' = H(x)T_1$, $T_2(x)$ is a non-singular solution of 
$T_2' = T_2K(x)$, and $C$ is an arbitrary constant matrix. 


* Received August 30, 1945. Presented to the Society, November 25, 1944. 

¹ One might allow the elements of these matrices to be merely Lebesgue integrable 
on $ab$; in this case, in (1.1) and the subsequently considered matrix differential 
equations it is understood that a solution is a matrix whose elements are absolutely 
continuous and which satisfies the differential equation almost everywhere on $ab$. 

² Numbers in square brackets refer to the bibliography at the end of this paper. 


COROLLARY 1. A solution $T(x)$ of (2.1) is of constant rank on $ab$. 


COROLLARY 2. If $T_\rho(x)$, $(\rho = 1, \cdots, kn+1;\ 1 \le k \le n)$, are solutions of 
(2.1), then there exist constants $c_1, \cdots, c_{kn+1}$ such that on $ab$ the matrix 

(2.2) $T(x) = \sum_{\rho=1}^{kn+1} c_\rho T_\rho(x)$ 


has constant rank not exceeding $n - k$. 


The result of Corollary 2 may be obtained, for example, by choosing the 
constants $c_\rho$ so that for $x = a$ the elements of $k$ rows of the matrix (2.2) are 
equal to zero. 

Now if $T_1(x)$ is a non-singular solution of $T_1' = H(x)T_1$, then $T_2(x) = T_1^{-1}(x)$ 
is a non-singular solution of $T_2' = -T_2H(x)$. The following 
result is then an immediate consequence of Lemma 2.1, and the similarity of 
$T_1(x)\,C\,T_1^{-1}(x)$ to a constant matrix. 


COROLLARY 3. The general solution of the matrix differential equation 

(2.3) $T' = H(x)T - TH(x)$ 


is of the form $T(x) = T_1(x)\,C\,T_1^{-1}(x)$, where $T_1(x)$ is a non-singular solution 
of $T_1' = H(x)T_1$, and $C$ is an arbitrary constant matrix. For a given 
solution $T(x)$ of (2.3) the coefficients of the characteristic equation of $T(x)$ 
are constants. 
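
The last statement of Corollary 3 lends itself to a quick numerical experiment. The sketch below integrates T' = H(x)T - TH(x) for 2 by 2 matrices with a classical fourth-order Runge-Kutta scheme and compares the trace and determinant of T at the two ends of the interval. The coefficient matrix H(x), the initial value and all function names are hypothetical choices made only for illustration.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_comb(*terms):
    """Linear combination of 2x2 matrices given as (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]


def H(x):
    # Hypothetical continuous coefficient matrix, chosen only for illustration.
    return [[x, 1.0], [-1.0, 0.5 * x]]


def rhs(x, t):
    """Right side of (2.3): H(x) T - T H(x)."""
    h = H(x)
    return mat_comb((1.0, mat_mul(h, t)), (-1.0, mat_mul(t, h)))


def integrate(t0, a, b, steps):
    """Classical RK4 integration of (2.3) from x = a to x = b."""
    t, x = t0, a
    h = (b - a) / steps
    for _ in range(steps):
        k1 = rhs(x, t)
        k2 = rhs(x + h / 2, mat_comb((1.0, t), (h / 2, k1)))
        k3 = rhs(x + h / 2, mat_comb((1.0, t), (h / 2, k2)))
        k4 = rhs(x + h, mat_comb((1.0, t), (h, k3)))
        t = mat_comb((1.0, t), (h / 6, k1), (h / 3, k2), (h / 3, k3), (h / 6, k4))
        x += h
    return t


def trace(m):
    return m[0][0] + m[1][1]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


if __name__ == "__main__":
    t_start = [[1.0, 2.0], [3.0, 4.0]]
    t_end = integrate(t_start, 0.0, 1.0, 2000)
    print(trace(t_start), trace(t_end))
    print(det(t_start), det(t_end))
```

For a 2 by 2 matrix the trace and determinant are exactly the coefficients of the characteristic equation, so their agreement at both ends (up to integration error) is what the corollary predicts.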


In the following theorems we shall be concerned with solutions $W(x)$ 
of the non-linear matrix differential equation (1.1) for which the elements of 
$W(x)$ are continuous on $ab$; such solutions will be referred to as continuous 
solutions. 


THEOREM 2.1. If $W_1(x)$ and $W_2(x)$ are two continuous solutions of 
(1.1) on $ab$, then $V(x) = W_2(x) - W_1(x)$ is a continuous solution of the 
matrix differential equation 

(2.4) $V' + V[A(x) + B(x)W_1(x)] + [D(x) + W_1(x)B(x)]V + VB(x)V = 0$ 

on $ab$; moreover, $V(x)$ is of constant rank on this interval. 


It may be verified directly that $V = W_2 - W_1$ satisfies (2.4). Now 
(2.4) may be written as an equation in $V$ of the form (2.1) with 


$H(x) = -[D(x) + W_1(x)B(x) + V(x)B(x)],$ 
$K(x) = -[A(x) + B(x)W_1(x)],$ 


and hence $V(x)$ is of constant rank on $ab$ by the above Corollary 1. 


THEOREM 2.2. If $V_1(x)$ and $V_2(x)$ are non-singular continuous solutions 
of the matrix differential equation 


(2.5) $V' + VA(x) + D(x)V + VB(x)V = 0$ 

on $ab$, then $V_2^{-1}(x) - V_1^{-1}(x)$ has constant rank on this interval. Moreover, 
if $V_\beta(x)$, $(\beta = 1, \cdots, kn+2;\ 1 \le k \le n)$, are non-singular continuous solutions 
of (2.5), then for each value of $\alpha$, $(\alpha = 1, \cdots, kn+2)$, there exist constants 
$c_{\beta\alpha}$, $(\beta = 1, \cdots, kn+2;\ \beta \ne \alpha)$, such that on $ab$ the matrix 


(2.6) $\sum_{\beta=1,\ \beta\ne\alpha}^{kn+2} c_{\beta\alpha}\,[V_\alpha^{-1}(x) - V_\beta^{-1}(x)]$ 


has constant rank not exceeding $n - k$. 


If $V(x)$ is a non-singular continuous solution of (2.5) on $ab$, then 
$U(x) = V^{-1}(x)$ is a solution of the non-homogeneous equation 

(2.7) $U' = A(x)U + UD(x) + B(x)$ 


on this interval. Consequently, if $V_1(x)$ and $V_2(x)$ are non-singular continuous 
solutions of (2.5) on $ab$, then $T(x) = V_2^{-1}(x) - V_1^{-1}(x)$ is a 
solution of the homogeneous equation 

(2.8) $T' = A(x)T + TD(x),$ 


and hence, by Corollary 1, the matrix $V_2^{-1}(x) - V_1^{-1}(x)$ is of constant rank 
on $ab$. Correspondingly, if $V_\beta(x)$, $(\beta = 1, \cdots, kn+2;\ 1 \le k \le n)$, are non-singular 
continuous solutions of (2.5), then for a fixed value of $\alpha$ the matrices 
$T_\beta(x) = V_\alpha^{-1}(x) - V_\beta^{-1}(x)$, $(\beta = 1, \cdots, kn+2;\ \beta \ne \alpha)$, are $kn+1$ 
solutions of (2.8), and by Corollary 2 there exist constants $c_{\beta\alpha}$ such that on 
$ab$ the matrix (2.6) has constant rank not exceeding $n - k$. 
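
The step from (2.5) to (2.7) is stated without computation. A short derivation, written here only as a reader's check and using nothing beyond the rule for differentiating an inverse matrix, runs as follows.

```latex
% Assumption: V(x) is non-singular and differentiable, and U = V^{-1}.
\begin{align*}
  U' &= -V^{-1} V' V^{-1}
        && \text{(derivative of an inverse)} \\
     &= V^{-1}\bigl[\,V A + D V + V B V\,\bigr] V^{-1}
        && \text{(substituting $V'$ from (2.5))} \\
     &= A V^{-1} + V^{-1} D + B \\
     &= A U + U D + B .
\end{align*}
```

Subtracting the resulting equations for two solutions removes the term B(x), which is why the difference of two inverses satisfies the homogeneous equation (2.8).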




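THEOREM 2.3. Suppose that 

$W_\beta(x)$, $(\beta = 1, \cdots, kn+3;\ 1 \le k \le n)$, 


are continuous solutions of (1.1) on $ab$, and that for a given index $\gamma$ the 
$kn+2$ matrices $W_\beta(x) - W_\gamma(x)$, $(\beta = 1, \cdots, kn+3;\ \beta \ne \gamma)$, are non-singular. 
Then for each value of $\alpha$, $(\alpha = 1, \cdots, kn+3;\ \alpha \ne \gamma)$, there 
exist constants $c_{\beta,\alpha;\gamma}$ such that on $ab$ the two matrices 


(2.9) $\sum_{\beta=1,\ \beta\ne\alpha,\gamma}^{kn+3} c_{\beta,\alpha;\gamma}\,[W_\beta - W_\alpha][W_\beta - W_\gamma]^{-1}, \qquad \sum_{\beta=1,\ \beta\ne\alpha,\gamma}^{kn+3} c_{\beta,\alpha;\gamma}\,[W_\beta - W_\gamma]^{-1}[W_\beta - W_\alpha]$ 


have the same constant rank, which does not exceed $n - k$. 

In view of Theorem 2.1, each of the matrices 

$V_\beta(x) = W_\beta(x) - W_\gamma(x)$, $(\beta = 1, \cdots, kn+3;\ \beta \ne \gamma)$, 

is a non-singular continuous solution of 


(2.10) $V' + V[A(x) + B(x)W_\gamma(x)] + [D(x) + W_\gamma(x)B(x)]V + VB(x)V = 0.$ 


The application of Theorem 2.2 to these solutions $V_\beta(x)$ of (2.10) then 
implies that for a given $\alpha \ne \gamma$ there exist constants $c_{\beta\alpha} = c_{\beta,\alpha;\gamma}$ such that 
on $ab$ the matrix 


(2.11) $\sum_{\beta=1,\ \beta\ne\alpha,\gamma}^{kn+3} c_{\beta\alpha}\,[V_\alpha^{-1} - V_\beta^{-1}]$ 


has constant rank not exceeding $n - k$. The fact that the first matrix of 
(2.9) has the same rank as (2.11) is an immediate consequence of the 
relations 


$V_\alpha^{-1} - V_\beta^{-1} = V_\alpha^{-1}[V_\beta - V_\alpha]V_\beta^{-1} = [W_\alpha - W_\gamma]^{-1}[W_\beta - W_\alpha][W_\beta - W_\gamma]^{-1}.$ 

Similarly, the relations 

$V_\alpha^{-1} - V_\beta^{-1} = V_\beta^{-1}[V_\beta - V_\alpha]V_\alpha^{-1} = [W_\beta - W_\gamma]^{-1}[W_\beta - W_\alpha][W_\alpha - W_\gamma]^{-1}$ 


imply that the second matrix of (2.9) has the same rank as the matrix (2.11). 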

It is to be remarked that the corresponding result obtained by Whyburn 
[6] for the equation (1.2) consisted only of the above result for the second 
matrix (2.9) in the special case $k = n$. It is to be noted, moreover, that if 
the elements of the matrices $A(x)$, $B(x)$, $C(x)$ and $D(x)$ are real-valued, 
and the elements of the solutions $W_\beta(x)$ are also real-valued, then the constants 
$c_{\beta,\alpha;\gamma}$ of the above theorem may be chosen as real. 

In order to derive further results on the form of the general solution of 
(1.1), suppose that $W_1(x)$ is a continuous solution of this equation on the 
interval $ab$, and let $T(x) = T_\rho(x)$, $(\rho = 3, \cdots, n^2+2)$, be $n^2$ solutions of 
the homogeneous equation 


(2.12) $T' = [A(x) + B(x)W_1(x)]T + T[D(x) + W_1(x)B(x)]$ 


which are linearly independent on $ab$. In view of Lemma 2.1 and its Corollary 1, 
such matrices may be determined by choosing the initial values at $x_0$ 
so that the elements of the $n^2$ matrices $T_\rho(x_0)$ are linearly independent. Now 
let $U_2(x)$ be a solution of the corresponding non-homogeneous equation 


(2.13) $U_2' = [A(x) + B(x)W_1(x)]U_2 + U_2[D(x) + W_1(x)B(x)] + B(x)$ 

which is such that each of the $n^2 + 1$ matrices 

$U_2(x)$, $U_\rho(x) = U_2(x) + T_\rho(x)$, $(\rho = 3, \cdots, n^2+2)$, 


is non-singular on $ab$. Such a choice is clearly possible, in view of the result 
of Lemma 2.1 for the equation (2.12). Then $W_1(x)$, together with the 
$n^2 + 1$ matrices 


(2.14) $W_\beta(x) = W_1(x) + U_\beta^{-1}(x)$, $(\beta = 2, \cdots, n^2+2)$, 

afford $n^2 + 2$ continuous solutions of (1.1) such that the $n^2 + 1$ matrices 

$V_\beta(x) = W_\beta(x) - W_1(x)$, $(\beta = 2, \cdots, n^2+2)$, 

are non-singular on $ab$. As 

$V_\beta^{-1}(x) = U_\beta(x)$, $(\beta = 2, \cdots, n^2+2)$, 

from the above construction we have that the matrices 

$V_2^{-1}(x) - V_\rho^{-1}(x) = -T_\rho(x)$, $(\rho = 3, \cdots, n^2+2)$, 


are linearly independent on $ab$. Now suppose that $W(x)$ is any continuous 
solution of (1.1) on $ab$ which is such that the matrix $W(x) - W_1(x)$ is non-singular. 
If in Theorem 2.3 we consider $k = n$, $\gamma = 1$, $\alpha = 2$, and identify 
$W(x)$ with $W_{n^2+3}(x)$, it then follows that the coefficient $c_{n^2+3,2;1}$ in (2.9) 
cannot be zero. Consequently, we have the following result as a corollary to 
Theorem 2.3. 


COROLLARY. If $W_1(x)$ is a continuous solution of (1.1) on the interval 
$ab$ the matrices $W_\beta(x)$, $(\beta = 2, \cdots, n^2+2)$, determined by (2.14) are 
additional continuous solutions of (1.1) such that, if $W(x)$ is an arbitrary 
continuous solution of this equation on $ab$ for which $W(x) - W_1(x)$ is non-singular, 
then there exist constants $c_\rho$, $(\rho = 3, \cdots, n^2+2)$, such that on 
this interval, 


(2.15$_1$) $[W - W_2][W - W_1]^{-1} = \sum_{\rho=3}^{n^2+2} c_\rho\,[W_\rho - W_2][W_\rho - W_1]^{-1},$ 


(2.15$_2$) $[W - W_1]^{-1}[W - W_2] = \sum_{\rho=3}^{n^2+2} c_\rho\,[W_\rho - W_1]^{-1}[W_\rho - W_2].$ 
p=3 


3. An associated system of linear differential equations. We shall 
now consider the relation between the matrix equation (1.1) and the system 
of $2n$ linear homogeneous differential equations which may be written in vector 
form as 

(3.1) $\eta' = A(x)\eta + B(x)\zeta, \qquad \zeta' = C(x)\eta - D(x)\zeta.$ 


In (3.1), $\eta$ and $\zeta$ are vectors, that is, $n \times 1$ matrices, with components $(\eta_i)$ 
and $(\zeta_i)$. The fundamental connection between (3.1) and (1.1) is given by 
the following theorem. 


THEOREM 3.1. There exists a set of $n$ solutions 

$\eta_i = \eta_{ij}(x)$, $\zeta_i = \zeta_{ij}(x)$, $(j = 1, \cdots, n)$, 


of (3.1) such that the matrix $\|\eta_{ij}(x)\|$ is non-singular on $ab$ if and only if 
(1.1) possesses a continuous solution on $ab$. 


For suppose that $W(x)$ is a continuous solution of (1.1) on $ab$, and 
consider the linear homogeneous matrix differential equation 


(3.2) $Y' = [A(x) + B(x)W(x)]Y.$ 


Let $Y(x) = \|\eta_{ij}(x)\|$ be a non-singular solution of (3.2), and define 
$Z(x) = \|\zeta_{ij}(x)\|$ as $Z(x) = W(x)Y(x)$. From (3.2) and (1.1) it then 
follows that 

(3.3) $Y' = A(x)Y + B(x)Z, \qquad Z' = C(x)Y - D(x)Z,$ 


and $\eta_i = \eta_{ij}(x)$, $\zeta_i = \zeta_{ij}(x)$, $(j = 1, \cdots, n)$, define $n$ solutions of (3.1) 
such that the matrix $\|\eta_{ij}(x)\|$ is non-singular on $ab$. 

On the other hand, if $\eta_i = \eta_{ij}(x)$, $\zeta_i = \zeta_{ij}(x)$, $(j = 1, \cdots, n)$, are 
solutions of (3.1) such that $\|\eta_{ij}(x)\|$ is non-singular on $ab$, then 
$Y(x) = \|\eta_{ij}(x)\|$, $Z(x) = \|\zeta_{ij}(x)\|$ 
satisfy (3.3), and it is verified readily that $W(x) = Z(x)Y^{-1}(x)$ is a continuous 
solution of (1.1) on $ab$. 
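
In the scalar case n = 1 the correspondence of Theorem 3.1 reduces to the classical substitution w = z/y, and it can be tested numerically. The sketch below integrates the linear system and the Riccati equation side by side with a fourth-order Runge-Kutta scheme and compares z/y with w at the end of the interval. The coefficient functions, the interval and all names are hypothetical and chosen only so that y stays positive.

```python
def coeffs(x):
    # Hypothetical continuous coefficients a, b, c, d on [0, 1] (illustration only).
    return 0.3 * x, 1.0 + x, 2.0, -0.5


def linear_rhs(x, state):
    """Scalar form of (3.1): y' = a y + b z, z' = c y - d z."""
    y, z = state
    a, b, c, d = coeffs(x)
    return (a * y + b * z, c * y - d * z)


def riccati_rhs(x, state):
    """Scalar form of (1.1) solved for w': w' = c - w a - d w - w b w."""
    (w,) = state
    a, b, c, d = coeffs(x)
    return (c - w * a - d * w - w * b * w,)


def rk4(f, x0, state, x1, steps):
    """Classical RK4 for a system whose state is a tuple of floats."""
    h = (x1 - x0) / steps
    x = x0
    for _ in range(steps):
        k1 = f(x, state)
        k2 = f(x + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k1)))
        k3 = f(x + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k2)))
        k4 = f(x + h, tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6 * (p + 2 * q + 2 * r + t)
                      for s, p, q, r, t in zip(state, k1, k2, k3, k4))
        x += h
    return state


if __name__ == "__main__":
    w0 = 0.0
    y_end, z_end = rk4(linear_rhs, 0.0, (1.0, w0), 1.0, 2000)
    (w_end,) = rk4(riccati_rhs, 0.0, (w0,), 1.0, 2000)
    print(z_end / y_end, w_end)
```

The two printed values should agree to within the integration error, which is the scalar content of the statement W = ZY^{-1}.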


4. The Legendre differential equation of the calculus of variations. 
For a non-parametric fixed end point problem of the calculus of variations 
the second variation is of the form 


(4.1) $I_2[\eta] = \int_{x_1}^{x_2} 2\omega(x,\eta,\eta')\,dx = \int_{x_1}^{x_2} [\tilde\eta' R(x)\eta' + 2\tilde\eta' Q(x)\eta + \tilde\eta P(x)\eta]\,dx,$ 


where $R(x)$, $Q(x)$ and $P(x)$ are $n \times n$ square matrices whose elements are 
real-valued and continuous on $x_1x_2$, while $R(x)$ and $P(x)$ are symmetric on 
this interval. In (4.1), $\eta$ is an admissible variation, that is, an n-dimensional 
vector whose components $\eta_i(x)$ are continuous and have piece-wise continuous 
derivatives on $x_1x_2$, while $\eta_i(x_1) = 0 = \eta_i(x_2)$. 

Along a non-singular extremal $E_{12}$ the matrix $R(x)$ is non-singular, and 
the canonical form of the accessory (Jacobi) differential equations is 

(4.2) $\eta' = A(x)\eta + B(x)\zeta, \qquad \zeta' = C(x)\eta - \tilde{A}(x)\zeta,$ 

where the coefficients of (4.2) are given by 

(4.3) $A = -R^{-1}Q, \quad B = R^{-1}, \quad C = P - \tilde{Q}R^{-1}Q.$ 


That is, the canonical form of the accessory equations is a system (3.1) with 
$D(x) = \tilde{A}(x)$, and $B(x)$, $C(x)$ symmetric matrices. 

Now if $E_{12}$ is a non-singular extremal which has on it no point conjugate 
to the point 1, there exists a solution $Y(x) = \|\eta_{ij}(x)\|$, $Z(x) = \|\zeta_{ij}(x)\|$ 
of the system 


(4.4) $Y' = A(x)Y + B(x)Z, \qquad Z' = C(x)Y - \tilde{A}(x)Z$ 


such that $Y(x)$ is non-singular on $x_1x_2$, while the matrix $\tilde{Y}(x)Z(x)$ is symmetric 
on this interval (see, for example, Bliss [3], Secs. 12, 36, 39). In view 
of the non-singularity of $Y(x)$, the symmetry of $\tilde{Y}(x)Z(x)$ is equivalent to 
the symmetry of the matrix $Z(x)Y^{-1}(x)$. The matrix differential equation of 
Riccati type associated with (4.4) is 


(4.5) $W' + WA(x) + \tilde{A}(x)W + WB(x)W - C(x) = 0,$ 




which, in view of (4.3), may be written in terms of the coefficients of (4.1) as 

(4.5') $W' + [\tilde{Q}(x) - W]R^{-1}(x)[Q(x) - W] - P(x) = 0.$ 


This matrix differential equation may properly be termed the Legendre 
differential equation for (4.1), since it is the direct generalization of the 
equation introduced by Legendre (see Bolza [4], Sec. 9) for the special case 
$n = 1$. From the symmetry of the matrices $B(x)$ and $C(x)$ it follows, in 
particular, that if $W(x)$ is a solution of (4.5) then $\tilde{W}(x)$ is also a solution 
of this equation. Consequently, if $W(x)$ is a solution of (4.5) such that at a 
particular point $x_0$ the matrix $W(x_0)$ is symmetric, then $W(x)$ is symmetric 
for all values of $x$. In view of Theorem 3.1 we then have that if a non-singular 
extremal $E_{12}$ has on it no point conjugate to the point 1, then there 
exists on $x_1x_2$ a continuous symmetric solution $W(x)$ of the Legendre differential 
equation (4.5). For such a solution $W(x)$ we have the following 
identity 


(4.6) $2\omega(x,\eta,\eta') = \tilde{u}R(x)u + (\tilde\eta W(x)\eta)',$ 


where $u = \eta' + R^{-1}(Q - W)\eta$. One could use the identity (4.6) to show 
that if $E_{12}$ is a non-singular extremal along which $I_2[\eta] \ge 0$ for arbitrary 
admissible variations $\eta$, then $R(x)$ is positive semi-definite, and consequently 
positive definite on $x_1x_2$. Such a procedure would not be desirable, however, 
since without the initial assumption of non-singularity one may show readily 
that if $I_2[\eta] \ge 0$ for arbitrary admissible variations then $R(x)$ is positive 
semi-definite on $x_1x_2$. In view of the relation between a continuous symmetric 
solution of (4.5) and a solution $Y(x)$, $Z(x)$ of (4.4), the above identity 
(4.6) is equivalent to the well known Clebsch (Legendre) transformation of 
the second variation. In particular, one may state that along a non-singular 
extremal $E_{12}$ the condition that $I_2[\eta] > 0$ for arbitrary non-identically vanishing 
admissible variations $\eta$ is equivalent to the condition that $R(x)$ be positive 
definite on $x_1x_2$ and there exists a continuous symmetric solution of (4.5) on 
this interval. 

The above relations between the canonical form of the accessory differ- 
ential equations and the Legendre matrix differential equation may be extended 
readily to problems of Lagrange with fixed end points. Along an extremal E12 
for such a problem the second variation is of the form (4.1), while the 
equations of variation for the side differential equations are of the form 


For simplicity, (4.7) will be written in the matrix form 

$\Phi[\eta] \equiv \phi(x)\eta' + \theta(x)\eta = 0,$ 


where $\phi(x) = \|\phi_{\beta j}(x)\|$ and $\theta(x) = \|\theta_{\beta j}(x)\|$ are $m \times n$ matrices. A given 
extremal $E_{12}$ is termed non-singular if the corresponding $(n+m)$-rowed 
square matrix 


(4.8) $\begin{Vmatrix} R(x) & \tilde\phi(x) \\ \phi(x) & 0 \end{Vmatrix}$ 

is non-singular on $x_1x_2$. The inverse of such a matrix (4.8) is then of the form 

(4.9) $\begin{Vmatrix} T(x) & \tilde\tau(x) \\ \tau(x) & t(x) \end{Vmatrix},$ 


where $T(x)$ and $t(x)$ are symmetric square matrices of orders $n$ and $m$, 
respectively, and $\tau(x)$ is an $m \times n$ matrix. 

The canonical form of the accessory differential equations is then of the 
form (4.2), with the coefficient matrices $A(x)$, $B(x)$, $C(x)$ given by 


$A = -(TQ + \tilde\tau\theta), \quad B = T, \quad C = P - \tilde{Q}TQ - \tilde{Q}\tilde\tau\theta - \tilde\theta\tau Q - \tilde\theta t\theta.$ 


Again, in view of the symmetry of the matrices $B(x)$, $C(x)$, we know that 
there exists a solution $Y(x)$, $Z(x)$ of (4.4) such that $Y(x)$ is non-singular 
and $\tilde{Y}(x)Z(x)$ is symmetric on $x_1x_2$ if and only if there exists a continuous 
symmetric solution $W(x)$ of the equation (4.5) on this interval. In terms of 
the coefficients of (4.1) and (4.7) the equation (4.5) now becomes 


(4.5'') $W' + [\tilde{Q}(x) - W]T[Q(x) - W] + [\tilde{Q}(x) - W]\tilde\tau\theta + \tilde\theta\tau[Q(x) - W] + \tilde\theta t\theta - P(x) = 0.$ 


Finally, corresponding to (4.6) we have the identity 
(4. 10) 20 (a, + y, + 


where $\mu = [\tau(W - Q) - t\theta]\eta$, and $u = \eta' + [T(Q - W) + \tilde\tau\theta]\eta$. The identity 
(4.10) is equivalent to the Clebsch transformation of the second variation 
(see Bliss [1], Sec. 32, and [2], Sec. 22). For a discussion of the relation 
between the non-existence on $E_{12}$ of a point which is conjugate to 1 and the 
existence of a solution $Y(x)$, $Z(x)$ of (4.4) of the above described type, the 
reader is referred to Sec. 23 of Bliss [2], and to the papers of Hestenes, 
Morse and Reid referred to therein. 


NORTHWESTERN UNIVERSITY. 




BIBLIOGRAPHY. 


[1] G. A. Bliss, "The problem of Lagrange in the calculus of variations," American 
Journal of Mathematics, vol. 52 (1930), pp. 673-744. 

[2] G. A. Bliss, The Problem of Bolza in the Calculus of Variations, mimeographed 
notes of lectures delivered at the University of Chicago, Winter Quarter, 
1935. 

[3] G. A. Bliss, The Calculus of Variations in Three-Space, mimeographed notes of 
revised lectures delivered at the University of Chicago, Winter Quarter, 
1938, and in preceding years. 

[4] O. Bolza, Vorlesungen über Variationsrechnung, Teubner, 1909. 

[5] W. T. Reid, "Some remarks on linear differential systems," Bulletin of the American 
Mathematical Society, vol. 45 (1939), pp. 414-419. 

[6] W. M. Whyburn, "Matrix differential equations," American Journal of Mathematics, 
vol. 56 (1934), pp. 587-592. 


REVERSIBILITY AND TWO-DIMENSIONAL AIRFOIL THEORY.* 


By GARRETT BIRKHOFF. 


Introduction. The present paper is concerned with the need for fundamental 
revisions in the hypotheses underlying mathematical hydrodynamics, 
with especial reference to two-dimensional airfoil theory. 

It is first shown, in §§2-3, that any reversible theory of lift and drag must 
be incomplete or grossly incorrect, and remarked that conventional two-dimensional 
airfoil theory is reversible. This is why it is only applicable to 
small angles of attack. 

The theory of lift, developed in §4, is shown to be unreliable even for such 
angles. It leads to absurd conclusions for simple "pathological" shapes (§5), 
predicts erroneously the effect on lift of increasing airfoil thickness (§5), and 
predicts correctly the effect of camber on lift only at high Reynolds' numbers 
(§6). The effect of camber on the stalling angle is predicted by reversibility 
considerations not mentioned in conventional airfoil theory (§6). 

The question of whether specific reversible hydrodynamical theories are 
incomplete (or incorrect) is shown to lead to interesting unsolved mathematical 
problems (§7). Finally, the reversibility of various other hydrodynamical 
theories is discussed (§8). 


2. The reversibility paradox. The terminology used below is standard. 
As regards notation, we shall denote position in space by $x = (x_1, \cdots, x_n)$; 
time by $t$, velocity components by $u_i$, pressure by $p$, and density by $\rho$. This 
notation permits effortless extension to $n$ dimensions of much of the theory of 
the two-dimensional case. 


Definition 1. By the reverse of a given flow, described in Lagrangian 
coordinates, we mean that flow obtained from the given flow by the substitution 
$t \to -t$. In Eulerian coordinates, this corresponds to the reversal $u_i \to -u_i$ 
of velocity direction as well as reversal $t \to -t$ of time, but pressure $p$ and 
density $\rho$ are still preserved at corresponding points $(x_1, \cdots, x_n; t)$ of space-time. 


Definition 2. A condition on flows is reversible if, whenever it holds for a 
flow, it holds for the reverse of that flow. A theory of fluid dynamics is a 
reversible theory if all the conditions which it imposes on flows are reversible. 

Actually, all the familiar conditions on flows are reversible, except those 
which involve viscosity (friction), thermal conduction, diffusion, or shock 
wave fronts. 


* Received October 4, 1945. 


Definition 3. A theory of fluid dynamics will be called incomplete if its 
conditions do not determine the steady flow around a body uniquely ; incorrect 
if its predictions do not agree closely with experimental fact. 


THEOREM 1 (Reversibility Paradox¹). Any reversible theory of fluid 
dynamics is either incomplete or grossly incorrect, so far as its predictions of 
steady state lift and drag are concerned. 


Proof. Such a theory will predict that a steady flow and its reverse will 
give the same pressure thrust on an obstacle, whereas it is a matter of common 
experience that a flow and its reverse ordinarily give pressure thrusts in 
approximately opposite directions. 


THEOREM 2 (Restricted d'Alembert Paradox). Any complete reversible 
theory of fluid dynamics must predict zero lift and drag for steady flow about 
a body symmetric in a point or in a plane perpendicular to the line of flow. 


Proof. We assume tacitly that the theory is invariant under rigid trans- 
formations of space—hence that reflection in a point or in a plane of symmetry 
replaces each flow permitted by the theory. But with a steady flow, the same 
effect can be achieved by reversing time, as in Definition 1. Hence, if the 
theory is complete, the two must be identical, and the pressure distribution 
must have the symmetry described. The conclusion is now obvious. 


Remark 1. In classical hydrodynamics, a stronger result is known:² the 
d'Alembert paradox asserts that the predicted lift and drag are zero even for 
unsymmetrical bodies inclined at any angle to the flow. However, even in this 
case Theorem 2 is not without interest, because of the extraordinary simplicity 
of the proof. Existing proofs of the d'Alembert paradox are highly complicated, 
and open to question on the score of rigor in the compressible case. 


¹ In special instances, this principle is not new; cf. for example C. Cranz, Handbuch 
der Ballistik, Teubner, 1913, vol. 1, Chap. II, and especially P. Painlevé, Leçons sur la 
résistance des fluides, Paris, 1930, p. 145. Painlevé applies the Reversibility Paradox 
to classical hydrodynamics, and relates it to the d'Alembert paradox. 

Some vaguely related questions are discussed by J. Meixner, "Reversible motions 
of liquids and gases," Annalen der Physik, vol. 41 (1942), pp. 409-425. But in general, 
physicists seem to ascribe only philosophical interest to the fact that the concept of 
non-viscous fluid motion is reversible. Thus the fact is not mentioned at all in [1], 
[2], or [4]. 

For a direct experimental description of the irreversibility, cf. N.A.C.A. Tech. 
Mem. 1011. 

² Assuming (1)-(3) of §3; for the literature, cf. the note by U. Cisotti, Comptes 
Rendus, vol. 178 (1924), p. 1792. 


Remark 2. The preceding proof breaks down if the restriction to steady 
flow is omitted. Thus reversible hydrodynamic theory may predict correctly 
the non-zero drag and lift due to acceleration (virtual mass). Cf. [2], p. 235, 
p. 419. It also may give a rough idea of the overturn moment to be expected 
from steady flow. 


Remark 3. The d'Alembert paradox shows that the common practice³ 
of separating head drag and tail drag in theoretical computations is inad- 
missible. No matter how long the mid-section of the body, the shape of the 
tail must theoretically influence the total thrust on the head just as much as 
the shape of the head itself—since the total thrust is zero. 


3. Application to airfoil theory. The classical theory of the steady 
flow of compressible and incompressible non-viscous fluids (cf. [1], Chaps. 
IV-VI, [2]) is based on the following mathematical hypotheses. Each mathematical 
hypothesis, in turn, is based on plausible physical arguments. 

There are hypothecated the equation of continuity 


(1) $\sum_{k=1}^{n} \partial(\rho u_k)/\partial x_k = 0;$ 


the condition that the vector velocity $u(x)$ be tangent to every solid surface; 
equation of motion 


(2) $\sum_{k=1}^{n} u_k\,\partial u_i/\partial x_k + (1/\rho)\,\partial p/\partial x_i = 0;$ 


the condition of uniform flow, which postulates that as $r \to \infty$ the limits of 
$u_i(x)$, $p(x)$, and $\rho(x)$ exist; the existence of a single-valued velocity potential 


(3) $U(x_1, \cdots, x_n)$ such that $u_i = \partial U/\partial x_i$ for all $i = 1, \cdots, n$; 


the condition that $p$ determine $\rho$ by a thermally controlled equation of state 
$\rho = f(p)$, whose precise form is obtained from physical considerations. 

The completeness of these conditions is discussed in §7. Substituting in 
Definitions 1-2 above, it is evident that, insofar as they express the classical 
theory, the classical theory of the steady flow of non-viscous fluids is reversible. 

In two-dimensional airfoil theory (cf. [2], Chap. VII, [3], or [4]) the 
hypothesis (3) of irrotationality in the large is replaced by the slightly weaker 
assumption 

(3') $\partial u_i/\partial x_k = \partial u_k/\partial x_i$ for all $i, k$ 


of irrotationality in the small, and the plausibility hypothesis (4) is added: 
velocity is finite at the sharp airfoil edge. Again it is evident that two-dimensional 
airfoil theory is reversible. 


³ Used in the theory of computing pressure distributions on airship hulls; cf. 
N.A.C.A. Report 516, Tech. Mem. 574, also Goldstein, p. 458. 

It is a corollary of this observation and Theorem 1 that two-dimensional 
airfoil theory can only give correct predictions of lift or drag for a limited 
range of “angles of attack” (i.e., orientations with respect to the air flow), 
In fact ([4], pp. 148-151) it fails grossly when the angle of attack exceeds 
the “ stalling angle” of 15°-25°. 

Although few numerical calculations have actually been made in the compressible 
case, we know a priori by Theorem 1 that there is no hope that taking 
account of compressibility will remove the limitation of the applicability of 
two-dimensional airfoil theory to small angles of attack. 


4. Theory of lift: incompressible case. In the case $\rho$ = const. of an 
incompressible fluid, there are many numerical calculations based on (1)-(2)-(3')-(4) 
which can be compared with experiment. 

The predicted drag is still zero (cf. [4], p. 165), and in practice two-dimensional 
airfoil theory is never used to estimate drag (which depends on 
"streamlining") for this reason.⁴ 

However, at small angles of attack, the predicted lift $L$ and lift coefficient 
$C_L$ (as defined in [2] or [4]) are frequently supposed to represent good estimates. 
The predicted lift for a given airfoil shape $A$ is most easily found 
from the Kutta-Joukowsky Theorem, which asserts ([4], p. 163) that the lift 
on $A$ is the product (for a fluid of density one) 


(5) $L = |u_\infty| \oint \sum_{k=1}^{2} u_k\,dx_k$ 


of the speed $|u_\infty|$ of the airflow by the "circulation" around the airfoil. 


⁴ Historical remarks. Thus the integral formulas of Newton and Euler (cf. P. 
Painlevé or C. Cranz, op. cit.) are more suggestive, though very misleading. Incidentally, 
Newton thought that what is usually referred to as the "Newtonian Theory" 
applied to gases, but not to liquids. Thus he believed the drag of a smooth convex 
body in a liquid to depend only on its maximum cross-section. (Book II, Lemmas 4, 5; 
Theorem 29). 


But now if A is represented on the complex z-plane, any schlicht con- 


formal transformation 


of the exterior of $A$ will carry potential flows with circulation $U(z)$ into 
potential flows $U(w(z))$ around the exterior of the image of $A$, which have 
the same circulation and the same velocity at infinity. Hence by the Kutta-Joukowsky 
Theorem, the lift $L$ is the same, and so the lift coefficient $C_L$ (based 
on the diameter) is inversely proportional to the diameter of the transform. 
By the fundamental existence and uniqueness theorems of conformal mapping,⁵ 
this determines the $C_L$ of a region of general shape; the argument is due to 
von Mises [3]. 

But it is known that any schlicht map 


of the interior of the unit circle $|z| < 1$ carries it into a region not included 
in any circle of radius less than one, but including all the circle⁶ $|w| < .5$. 

Under inversion, these inequalities imply that (6) maps the exterior of 
the unit circle onto the exterior of a region of diameter at least two and at 
most four. For a circle, $C_L = 4\pi\sin\theta$ (cf. [4]), where $\theta$ is the "angle of 
attack" between the direction of flow and the radius to the rear stagnation 
point. Hence for a general profile $A$ 


(7) = C,* sin — 6) where C,* S 47; 
this result is in [3], which is however hardly available to English-speaking 
readers. 

It is another corollary that for equal angles of attack and wind velocities, 
if $A$ contains a profile $A_1$, then the predicted lift exerted by $A$ exceeds the 
predicted lift exerted by⁷ $A_1$. Finally, it follows that the predicted lift coefficient 
varies continuously with the shape, if the position of the sharp edge is 
held fixed. (To see this, reduce to the case of the circle; for all shapes between 
and |z|=1+ 6, 4r(1— 2) S4r.) Incidentally, C*, 
is proportional to two-dimensional electrostatic capacity, to which these results 
also apply. 


⁵ Cf. C. Carathéodory, Conformal Mapping, Cambridge University Tracts, p. 70. 

° The first inequality has an elementary proof; the area of the image circle is by 
computation (1+ | a, |*+|a;|*-+---.-) 2a. The second is Bieberach’s Verzerrungs- 
satz; ef. L. Bieberbach, Lehrbuch der Funktionentheorie, Vol. II (1927), Section 9. 

⁷ This is also a direct corollary of Carleman's Principle of Regional Extension; 
cf. R. Nevanlinna, Eindeutige analytische Funktionen, Springer, 1936. 




5. Comparison with experiment: symmetric airfoils. An objective 
comparison of predicted lift with observed lift brings to light various 
discrepancies. 

In the first place, the importance of the sharp edge is greatly exaggerated.
Thus by transforming the circle $|z + 99| = 100$ under $z \to z + 1/z$, we get
a “Joukowski profile” consisting of a near circle with a small thorn. The
predicted lift of this is as great as that of a thin wing having twice the
diameter, which is absurd. By transformations $w \to aw + b/w$ ($a$, $b$ real),
we can map the circle with a tiny thorn onto a thin ellipse, with a thorn at
the end of a minor axis. Theoretically, tilting the major axis will cause an
excess of pressure on the downstream side, which is even more absurd.
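
The size of the thorn can be checked numerically. The following is only an illustrative sketch, not part of the original argument; the helper names `joukowski` and `thorn_profile` are invented here. It samples the circle |z + 99| = 100, applies z → z + 1/z, and measures how far the image departs from a circle of radius 100 about w = −99.

```python
import cmath
import math


def joukowski(z: complex) -> complex:
    """The transformation z -> z + 1/z quoted in the text."""
    return z + 1 / z


def thorn_profile(samples: int = 3600) -> list:
    """Image of the circle |z + 99| = 100 under z -> z + 1/z."""
    points = []
    for k in range(samples):
        phi = 2 * math.pi * k / samples
        z = -99 + 100 * cmath.exp(1j * phi)  # phi = 0 gives z = 1
        points.append(joukowski(z))
    return points


profile = thorn_profile()
tip = profile[0]  # image of z = 1, a critical point of the map
# Radial distance of each image point from w = -99.
radii = [abs(w + 99) for w in profile]
side = radii[len(radii) // 4]  # image of z = -99 + 100i

print(f"sharp edge at w = {tip:.6f}")
print(f"radius at the tip  = {radii[0]:.4f}")
print(f"radius at the side = {side:.4f}")
```

On this sampling the image stays within one unit of the circle of radius 100, and the departure is concentrated near the sharp edge at w = 2, which is the sense in which the profile is a near circle with a small thorn.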

In the case of conventional airfoil shapes, other discrepancies occur. 
Here the basic shape is a straight line segment, slightly rounded at the leading 
edge (cf. [3], [4]). This is varied by introducing added thickness, and 
camber (or curvature, usually downward). 

For shapes approximating a straight line segment, the conventional theory
predicts⁸ $C_L = 2\pi \sin\theta$. Observed values at small angles of attack are
consistently⁹ about 10%-25% less than this.

For symmetric airfoil shapes, the predicted $C_L^*$ in (7)
increases with thickness; the observed $C_L^*$ decreases with thickness. Thus
the sign of the differential change is wrong,¹⁰ and also the absolute discrepancy
with theory is even greater than 30%.

Since the sign of the differential change is incorrect, we infer that
two-dimensional airfoil theory cannot be expected to predict correctly differential
changes in lift due to small changes in profile shape. This does not imply,
however, that the theory will not give a good idea of the differential changes
in flow patterns and pressure distributions due to small changes in profile
shape.¹¹


8 In the limiting case of a flat plate, as noted by Cisotti (Rendic. Lincei, 1927), the
Kutta-Joukowski Theorem fails. This is because of infinite pressure per unit area at
the leading edge, which may be interpreted as giving finite thrust on the leading edge.
This Paradox of Cisotti also applies to circular arc profiles.

9 Cf. [4], pp. 148-151; also N.A.C.A. Technical Report 244, Refs. 506, 508; Report
628, Figs. 5, 11. Also Phil. Trans., 225A (1925), pp. 199-245.

10 Cf. also Durand, Aerodynamic Theory, vol. 2, p. 71. Th. von Kármán suggested
to the author the following interpretation. The actual flow differs from the predicted
flow in that a thick wake is shed by the downstream rear end of the airfoil. Thus in a
certain sense, the effective angle of attack is less than the nominal angle of attack.
The force of this argument is greatest for thick airfoils, which have the thickest wakes.

11 Methods of computing these have been worked out by M. Munk, and by T. 
Theodorsen; cf. N.A.C.A. Report 411. The author would take the statement on p. 10 
that “The moments about any required axis may be found” with reservation, as it 




6. Cambered airfoils. If $\theta_0$ is the angle between the chord joining the
opposite ends of an airfoil and the airflow when $C_L = 0$, the lift coefficient is

(7) $C_L = C_L^* \sin(\theta - \theta_0)$

both theoretically and (approximately) experimentally when $\theta - \theta_0$ is small.
With a symmetric airfoil, $\theta_0 = 0$ by symmetry.

In the typical case of shapes approximating⁸ a circular arc, the theoretical
values of $C_L^*$, $\theta_0$ are easily found. Under the transformation

(8) $w = z + 1/z$, or $u = x + x/r^2$, $v = y - y/r^2$,

the inverse image of a general circle $u^2 + v^2 + 2Cv = 4$ through $(\pm 2, 0)$
is the locus $r^4 + 1 + 2(x^2 - y^2) + 2Cy(r^2 - 1) = 4r^2$.
If $C = -A + 1/A$, this can be factored into

$(r^2 - 1 - 2Ay)(r^2 - 1 + 2y/A) = 0$.

Hence (8) maps the two orthogonal circles $r^2 - 2Ay = 1$, $r^2 + 2y/A = 1$
through $(\pm 1, 0)$ into $u^2 + v^2 + 2Cv = 4$. In particular ([4], p. 179),
the exterior of $x^2 + y^2 - 2Ay = 1$ is mapped on the exterior of that part of
the arc of $u^2 + v^2 + 2Cv = 4$ in the upper half-plane. The “camber” of
this arc, which passes through $(0, 2A)$, is defined as the ratio (maximum
distance from chord)/(chord length) $= 2A/4 = A/2$.

A flow around the circular arc has finite velocity at the trailing edge
$(-2, 0)$ if and only if the flow around the circle $x^2 + y^2 - 2Ay = 1$ has
zero velocity at $(-1, 0)$. This is true if the flow is parallel to the diameter
through $(-1, 0)$, which passes through $(0, A)$ and hence makes an angle of
$\theta_0 = -\mathrm{Tan}^{-1} A$ with the chord. Furthermore, since the circle mapped on the
arc has a radius $\sqrt{1 + A^2}$, as compared with 1 for a straight line, we have
in (7) for “camber” $A/2$

(9) $C_L^* = 2\pi\sqrt{1 + A^2}$, $\theta_0 = -\mathrm{Tan}^{-1} A$.
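
As a numerical cross-check, the sketch below assumes the real form of (8) is u = x + x/r², v = y − y/r², and that the lift slope scales with the radius √(1 + A²) of the mapped circle; both are readings of the damaged text rather than quotations, and the function names are invented for illustration. It confirms that the circle x² + y² − 2Ay = 1 lands on u² + v² + 2Cv = 4 with C = −A + 1/A, and evaluates the predictions for a camber of 3/20.

```python
import math


def joukowski_xy(x: float, y: float) -> tuple:
    """Real form of w = z + 1/z: u = x + x/r^2, v = y - y/r^2."""
    r2 = x * x + y * y
    return x + x / r2, y - y / r2


def arc_residual(A: float, samples: int = 720) -> float:
    """Largest |u^2 + v^2 + 2Cv - 4| over the image of x^2 + y^2 - 2Ay = 1."""
    C = -A + 1 / A
    radius = math.sqrt(1 + A * A)  # the circle has centre (0, A)
    worst = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        x = radius * math.cos(t)
        y = A + radius * math.sin(t)
        u, v = joukowski_xy(x, y)
        worst = max(worst, abs(u * u + v * v + 2 * C * v - 4))
    return worst


A = 0.3  # camber A/2 = 3/20
residual = arc_residual(A)
slope_gain = math.sqrt(1 + A * A) - 1     # relative increase in C_L*
theta0_deg = -math.degrees(math.atan(A))  # predicted zero-lift angle

print(f"residual = {residual:.2e}")
print(f"predicted increase in C_L* = {100 * slope_gain:.1f}%")
print(f"predicted theta_0 = {theta0_deg:.1f} degrees")
```

The predicted increase in lift slope comes out a little over 4%, in line with the small theoretical figure quoted for Eiffel's profile.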


However, the experimental facts are more complex. The original experiments
of Eiffel¹² at low Reynolds numbers indicated that down-camber produced a
very large increase in $C_L^* = (dC_L/d\theta)_{\max}$; moreover the observed $\theta_0$ was


refers to differentials of integrated pressure effects, and these do not seem to be cor- 
rectly predicted by theory. 

12 Cf. G. Eiffel, The Resistance of the Air and Aviation, translated by J. Hunsaker,
London and Boston, 1913. Thus (p. 482) with an estimated “camber” $A/2$ of 3/20,
an increase in $C_L^*$ of about 50% was observed; one of only 4.5% being predicted.
Moreover 0, — 2° instead of the predicted —15°. Cf. also [4], p. 151, where with 
an estimated camber of 3/20, also N. A.C. A. Rep. 93, Refs. 77, 117. 




nearly zero. Later experiments⁹ at intermediate Reynolds numbers indicated
a smaller but considerable increase in $C_L^*$, and a larger negative $\theta_0$. The most
recent experiments at high Reynolds numbers find, as predicted, a very small
increase in $C_L^*$ with down-camber and a change in the angle $\theta_0$ of zero lift
quite near to that predicted.¹³

In summary, the predicted variations in $C_L^*$ and $\theta_0$ due to camber are
observed at high Reynolds numbers, but not at low ones.

The following theoretical explanation of the experimental fact that
down-camber increases the effective stalling angle and maximum lift is perhaps also
of interest. By reversibility, nature should abhor infinite velocities at the
leading edge just as much as at the trailing edge of an airfoil. Physically,
high relative velocities at the leading edge will cause separation there, i.e.,
stalling. But with a circular arc airfoil of camber $\gamma = A/2$, finite velocity
at the ends corresponds to having zero velocity at the inverse image points
$(\pm 1, 0)$ of the circle mapped on the airfoil by (8). By reversibility, this
corresponds to a horizontal wind direction, and hence to a zero angle of attack,
or by (9) effective angle of attack $(\theta - \theta_0) = \mathrm{Tan}^{-1} A$. Thus camber $\gamma$
should not change the stalling angle, should increase the effective stalling
angle by $\mathrm{Tan}^{-1} 2\gamma \approx 2\gamma$, and so (using an empirical value $C_L^* = 4$) increase
the maximum lift by roughly $8\gamma$.
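
The arithmetic behind the rough figure 8γ can be tabulated. This is a minimal sketch under the stated reading, namely that the gain in maximum lift is the empirical slope 4 times the rise Tan⁻¹ 2γ in effective stalling angle; the function name `max_lift_gain` is hypothetical.

```python
import math


def max_lift_gain(camber: float, lift_slope: float = 4.0) -> float:
    """Estimated gain in maximum lift coefficient for camber gamma = A/2.

    The effective stalling angle is taken to rise by atan(2 * gamma); the
    gain is that angle times the empirical slope C_L* = 4 quoted in the text.
    """
    return lift_slope * math.atan(2 * camber)


for gamma in (0.02, 0.05, 0.10):
    exact = max_lift_gain(gamma)
    rough = 8 * gamma  # the small-angle figure given in the text
    print(f"gamma = {gamma:.2f}: {exact:.4f} vs 8*gamma = {rough:.4f}")
```

For the cambers in common use the small-angle value 8γ differs from the arctangent form by about one percent or less.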

7. Mathematical completeness. After the discussion of Sections 2-8,
there still remains the academic question of whether classical hydrodynamics 
and modern airfoil theory are incomplete or incorrect. As this question leads 
to important unsolved problems in pure mathematics which have not attracted 
general attention either among pure or applied mathematicians, it seems worth 
discussing here. 

The answer is threefold. In the incompressible case, both theories have
been proved¹⁴ to be complete, by rigorous mathematical arguments. Actually,
for certain shapes in three dimensions, the classical theory is overdetermined;
no solution is possible. In the compressible subsonic case, it is guessed¹⁵ that


13 Cf. N.A.C.A. Tech. Reps. 460, 628 by E. N. Jacobs, where a variable density
wind-tunnel at 20 atm. pressure and 70-100 f/s wind speed were used.

14 For the classical theory, the most general proof is that of L. Lichtenstein, Hydro- 
dynamik, p. 422. This applies to non-homogeneous incompressible fluids, and depends 
on the theory of linear partial differential equations of elliptic type; in the homo- 
geneous case, only potential theory is needed. For modern airfoil theory (homogeneous 
case), completeness follows from the ref. of footnote 5. 

15 Cf. H. Bateman, “Two-dimensional motion of a compressible fluid,” Proceedings
of the Royal Society of London, 125 A (1929), pp. 598-618, where the difference between
the subsonic and supersonic cases is brought out; also Th. von Kármán,
“Compressibility effects in aerodynamics,” J. Aer. Sci., vol. 8 (1941), pp. 337-356, and the review




the classical theory is determinate or “ complete ” ; at all events, it is equivalent 
to a non-linear partial differential equation of elliptic type with boundary 
conditions. In the supersonic case, one gets a differential equation of hyper- 
bolic type, and therefore one may guess that there are many solutions. By 
putting further irreversible conditions on the solution, it may be possible to 
get a correct complete theory of drag.¹⁶ Little is known about the theory of
circulation for compressible fluids. 

We thus find several unsolved problems in pure mathematics which have 
important applications to theoretical hydrodynamics. (1) The uniqueness 
of the solution of the differential equations with boundary conditions corre- 
sponding to subsonic flow, of compressible non-viscous fluids. (2) Existence 
of a solution to the same. (3) Corresponding problems (with presumably a 
contrary answer as regards uniqueness) for supersonic flow. (4) A rigorous 
discussion of the relation between the uniformity hypothesis (3) as usually
stated, and the analyticity at infinity usually tacitly assumed.¹⁷ It is at this
point that the author questions the rigor of the proofs of the d’Alembert 
paradox for compressible fluids which exist in the literature; the point acquires 
special importance in view of the fact that the assumption of analyticity at 
infinity may be too strong to permit solutions. 

In the two-dimensional case, one can reduce to linear partial differential
equations by use of the hodograph method,¹⁸ but the boundary conditions
become very involved. The author proposes to apply this method also to the
discussion of the behavior of compressible fluids at infinity and at cusps
(which applies to airfoil theory) in a later paper.


8. Other aspects of reversibility. In any discussion of reversibility, 
it should be recalled that the kinetic theory of gases is reversible in the sense 


thereof in Mathematical Reviews (1942), p. 220. Incidentally, the statement on p. 8,
lines 7-10 of von Kármán’s “The problem of resistance in compressible fluids,” Reale
Accademia d'Italia (1936-XIV), seems to be contradicted by the example of the
Lebesgue spine in potential theory. For the supersonic case, cf. the latter paper,
esp. p. 17. The author has interpreted these papers freely, with the idea of expressing
what seem to be the most expert guesses by applied mathematicians.

16 This is essentially what is done in Th. von Kármán and N. B. Moore, “Resistance
of slender bodies, etc.,” Transactions of the American Society of Mechanical Engineers
(1932), pp. 303-310.

17 For rigorous discussions bearing on the incompressible case, cf. O. D. Kellogg, 
Potential Theory, now being reprinted in this country, Chap. X, Section 8, and Chapter 
VIII, Section 3. The case of two dimensions reduces to complex variable theory. 

18 Cf. Stefan Bergman, The Hodograph Method in the Theory of Compressible
Fluids, Providence (Brown University), 1942. The author is indebted to Dr. Bergman 
for many stimulating conversations, and especially for the point made here. 




of the dynamics of systems of particles.¹⁹ Thus for a hydrodynamical theory
to be irreversible, it must involve statistical mixing effects tending to dissipate 
energy (increase entropy). 

Various other theories are reversible as far as general principles of fluid 
mechanics are concerned, and are irreversible only because of some single 
special assumption. 

This is true of the Helmholtz-Kirchhoff theory of wake ([2], Chap. XII),
which is reversible except for the empirical postulation of a stagnant wake
behind instead of ahead of the obstacle. Also, the von Kármán-Moore approximate
theory of drag,¹⁶ for bullets moving at supersonic speeds, is reversible
except where they select arbitrarily the rear instead of the forward real
portion of the complex solution of a hyperbolic differential equation.

Perhaps some of the confusion which seems to remain²⁰ in the fundamentals
of the theory of resistance to surface waves can be attributed to the
difficulty of finding irreversible hypotheses of a general nature. 

In concluding, we note the difficulty of constructing any smooth theory 
which will predict a drag proportional to the square of the velocity. Indeed, 
the vector drag is proportional to $u^2$ if $u > 0$ and to $-u^2$ if $u < 0$, under such
circumstances. Hence we have a singularity at $u = 0$, in the sense of analytic
functions; this must correspond to a singularity in the theory. In Newton’s
theory,* this is due to a reversal of the face on which wind is supposed to
press; in the Helmholtz-Kirchhoff theory of wake, to a reversal of the location
of wake. In general, the difficulty in continuing a theory through u = 0 may 
be ascribed to the Reversibility Paradox. It is easy to give plausible general 
reasons why drag should be proportional to u?, but difficult to explain the 
reversal in sign of drag with reversal in sign of u, by a single theory. 


HARVARD UNIVERSITY. 


BIBLIOGRAPHY. 


[1] H. Lamb, Hydrodynamics, 6th Edition, Cambridge University Press. 

[2] L. M. Milne-Thomson, Theoretical Hydrodynamics, London, 1938.

[3] R. von Mises, Zeits. Flugt. u. Motorluftschiffahrt, 1917, pp. 157-163, and 1920,
pp. 67-73 and 87-89. 

[4] Prandtl-Tietjens, Applied Hydro- and Aeromechanics, McGraw-Hill, 1934. 


19 Cf. for example G. D. Birkhoff, Dynamical Systems, New York, 1927, p. 27. In
the n-body problem, Lagrangian coordinates (in the hydrodynamical sense) are used. 

20 Cf. [1], Sections 255-256, Section 249. There seem to be various explanations
of wave resistance. 




A LIMIT THEOREM FOR RANDOM VARIABLES WITH 
INFINITE MOMENTS.* 


By W. FELLER. 


Let $\{X_k\}$ be an arbitrary sequence of mutually independent random
variables and $\{a_n\}$ a monotonic numerical sequence. As usual, we put

(1) $S_n = X_1 + \cdots + X_n$.

We consider the

Event $\mathcal{E}$: “The inequality
(2) $|S_n| > a_n$
takes place for infinitely many n.”


According to the familiar “one or naught law,” the probability of $\mathcal{E}$
can be only zero or one. For the case where the $X_k$ are individually bounded,
an extensive theory has been developed and a recent refinement of Kolmogoroff’s
law of the iterated logarithm¹ enables us to decide in any special case whether
the probability is zero or one. Now this theory depends essentially on the cen- 
tral limit theorem. As soon as we leave the domain of applicability of the 
central limit theorem we find ourselves on practically unknown terrain; the 
problems receive an entirely new aspect and no systematic tools have as yet 
been developed for treating the theory. Outside of the theory of the iterated 
logarithm only one result seems to be known. The following theorem treats a 
very special case and is given in a form whose probability meaning is not 
readily intelligible. Its interest is nevertheless considerable inasmuch as it 
shows the radical change in the character of the limit theorems caused by the 


absence of finite moments. 


THEOREM (P. Lévy-J. Marcinkiewicz).² Suppose that one has uniformly
for large x and all k 


* Received November 26, 1945. 

1W. Feller, “ The general form of the so-called law of the iterated logarithm,” 
Transactions of the American Mathematical Society, vol. 54 (1943), pp. 373-402. 

2 P. Lévy, “Sur les séries dont les termes sont des variables éventuelles
indépendantes,” Studia Mathematica, vol. 3 (1931), pp. 117-155; J. Marcinkiewicz, “Quelques
théorèmes de la théorie des probabilités,” Travaux de la Société des Sciences et des




(3) $c\,x^{-\alpha} < \Pr\{|X_k| > x\} < C\,x^{-\alpha}$

where $\alpha$, $c$ and $C$ are positive constants. Let $\lambda(t)$ be an increasing function
such that $\lambda(2t)/\lambda(t) \to 1$ as $t \to \infty$. For the special sequence

(4) $a_n = \{n \log n\, \lambda(\log n)\}^{1/\alpha}$

and $0 < \alpha < 1$ the event $\mathcal{E}$ has probability one (zero) if the series

(5) $\sum \{n\lambda(n)\}^{-1}$

diverges (converges); the theorem remains true also for $1 \le \alpha < 2$ provided
that $E(X_k) = 0$ (in case $\alpha = 1$ the Cauchy principal value of the expectation
is meant).


This theorem was proved by P. Lévy in the case $0 < \alpha < 1$ using the
theory of stable distributions. The method does not work for $\alpha = 1$.
Marcinkiewicz’s proof is of an elementary nature. The striking difference between
the P. Lévy-Marcinkiewicz theorem and the law of the iterated logarithm
becomes apparent if one notices that according to the former any sequence
$\{a_n\}$ exhibits the same character as $\{Ma_n\}$, where $M > 0$ is an arbitrary
constant. With the iterated logarithm all that can be said is that $\{\phi_n\}$ and
$\{\phi_n + M/\phi_n\}$ belong to the same class. In particular, $\{\eta \log\log n\}^{1/2}$ would
belong to the upper class if $\eta > 2$, to the lower if $\eta < 2$.

Consider now the 


Event $\mathcal{E}^*$: “The inequality
(6) $|X_n| > a_n$
takes place for infinitely many n.”


A simple computation shows that, in the cases where the P. Lévy-Marcinkiewicz
theorem applies, the events $\mathcal{E}$ and $\mathcal{E}^*$ have the same probability.
In other words: if the conditions (3) and (4) are satisfied, the
asymptotic behavior of the sums $\{S_n\}$ is entirely determined by the last terms
$X_n$: as far as maxima are concerned, $S_{n-1}$ can be neglected in comparison
with $X_n$. Considered against the subtle background of the iterated logarithm,
this behavior is very crude indeed. We shall see that it is typical for the case
of infinite variance and also that surprisingly simple analytic methods suffice
to treat this case. (The theory becomes the simpler the fewer moments are
finite.)
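
The dichotomy in series (5) can be illustrated numerically. The sketch below assumes that (5) reads Σ 1/(nλ(n)), which is a reconstruction of the damaged formula, and uses two hypothetical choices of λ, both increasing with λ(2t)/λ(t) → 1. Finite partial sums cannot prove divergence, so this only shows the trend: successive blocks of the first series shrink very slowly, those of the second much faster.

```python
import math


def tail_block(lam, start: int, stop: int) -> float:
    """Sum of 1 / (n * lam(n)) for start < n <= stop."""
    return sum(1.0 / (n * lam(n)) for n in range(start + 1, stop + 1))


def slow(t: float) -> float:
    """lambda(t) = log t: the series diverges, like log log n."""
    return math.log(t)


def fast(t: float) -> float:
    """lambda(t) = (log t)^2: the series converges."""
    return math.log(t) ** 2


for a, b in ((10**3, 10**4), (10**4, 10**5)):
    print(a, b, round(tail_block(slow, a, b), 4), round(tail_block(fast, a, b), 4))

grow_slow = tail_block(slow, 10**3, 10**5)
grow_fast = tail_block(fast, 10**3, 10**5)
```

By the Borel-Cantelli lemmas the same series decides the probability of the event that |X_n| > a_n infinitely often when the X_n are independent, which is why the two events can be compared through a single series.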


Lettres de Wilno, Classe des Sciences Mathématiques et Naturelles, vol. 13 (1939), 
pp. 1-13. 


For simplicity we shall consider only the case where all $X_k$ have the same
distribution function

(7) $\Pr\{X_k \le x\} = V(x)$.

This condition can trivially be relaxed in the direction of (3) where it is only
required that the distribution functions of the $X_k$ should not differ too much
from a given distribution function. However, the methods of this paper do
not apply to the general case of distribution functions varying with k. The
fact that absolute values are introduced in (2) is formally another restriction:
however, it simplifies the formulation and it is not difficult to separate the
cases $S_n > a_n$ and $S_n < -a_n$.


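THEOREM 1. Suppose that for some $0 < \varepsilon < 1$

(8) $\int_{-\infty}^{+\infty} |x|^{1+\varepsilon}\, dV(x) = \infty$

but that the first moment exists and

(9) $\int_{-\infty}^{+\infty} x\, dV(x) = 0$.

For the particular sequences $a_n = n^{1/(1+\varepsilon)}$ and $a_n = n$ the event $\mathcal{E}$ has
probability one and zero, respectively. For any sequence $\{a_n\}$ for which there exists
an $\varepsilon$ with $0 \le \varepsilon < 1$ such that³

(10) $a_n n^{-1/(1+\varepsilon)} \uparrow$, $a_n/n \downarrow$

the probability of $\mathcal{E}$ is one or zero according as the series

(11) $\sum_{n} \int_{|x| \ge a_n} dV(x)$

diverges or converges.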


3 The restriction (10) seems so natural and mild that no effort has been made to
remove it. Actually, the proof requires much less than (10), namely 


(10a) n/a, < Const. (n + 
and 

(10b) $\sum_{\nu \ge n} a_\nu^{-2} = O(n a_n^{-2})$.

The last condition is certainly satisfied if, for example, $\liminf\, (a_{2n}^2 / a_n^2) > 2$.




THEOREM 2. If

(12) $\int_{-\infty}^{+\infty} |x|\, dV(x) = \infty$

and $a_n = n$, the event $\mathcal{E}$ has probability one. For any sequence $\{a_n\}$ with

(13) $a_n/n \uparrow$

the probability of $\mathcal{E}$ is one or zero according as (11) diverges or converges.

The statement of either theorem can be reformulated to the effect that
the probability of $\mathcal{E}$ is the same as that of $\mathcal{E}^*$.


Proof. If (11) diverges, the implication is trivial. It suffices therefore
to assume that (11) converges. Put

(14) $\mu_k = \int_{|x| < a_k} x\, dV(x)$

and

(15) $X'_k = X_k - \mu_k$ if $|X_k| < a_k$, $X'_k = -\mu_k$ if $|X_k| \ge a_k$.

The probability that $X_k \ne X'_k + \mu_k$ is given by the general term of (11).
As this series converges, we have with probability one

(16) $S_n - \sum_{k=1}^{n} (X'_k + \mu_k) = O(1) = o(a_n)$.

In order to prove that, with probability one,

(17) $\sum_{k=1}^{n} X'_k = o(a_n)$

it suffices according to a frequently used trick⁴ to show that with certainty

(18) $\sum_{k=1}^{\infty} X'_k / a_k$ converges.

Now $X'_k$ has vanishing expectation, and according to a theorem of Khintchine
and Kolmogoroff it suffices to prove that


4 The method of proving (17) from (18), applying Kronecker’s theorem, is due to
H. Rademacher (“Einige Sätze über Reihen von allgemeinen Orthogonalfunktionen,”
Mathematische Annalen, vol. 87 (1922), pp. 112-138). Its usefulness in proving general 
probability limit theorems has been demonstrated by A. Zygmund (in an article, written 
in Polish, in Mathesis Polska, vol. 8 (1933), pp. 76-87, where the real variable treat- 
ment of the theory of probability was outlined). Subsequently, the method has been 
extensively used in the Polish probability literature and is by now a familiar tool. 




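(19) $\sum_{k=1}^{\infty} a_k^{-2}\, E(X_k'^2) < \infty$.

Now, putting $a_0 = 0$,

(20) $\sum_{k=1}^{\infty} \frac{1}{a_k^2} \int_{|x| < a_k} x^2\, dV(x) = \sum_{k=1}^{\infty} \frac{1}{a_k^2} \sum_{i=1}^{k} \int_{a_{i-1} \le |x| < a_i} x^2\, dV(x) = \sum_{i=1}^{\infty} \int_{a_{i-1} \le |x| < a_i} x^2\, dV(x) \sum_{k=i}^{\infty} \frac{1}{a_k^2}$.

The first of the conditions (10) implies that

$\sum_{k=i}^{\infty} a_k^{-2} < \frac{3}{1-\varepsilon}\, i\, a_i^{-2}$.

Therefore the series (20) is less than

$\frac{3}{1-\varepsilon} \sum_{i=1}^{\infty} i \int_{a_{i-1} \le |x| < a_i} dV(x) = \frac{3}{1-\varepsilon} \sum_{i=1}^{\infty} \int_{|x| \ge a_{i-1}} dV(x)$

and the latter series converges by assumption. As the terms of (19) are not
exceeded by those of (20), the convergence of (19), and therefore the certainty
of (18), have been established.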
In view of (16) and (17) it remains only to prove that

(21) $\sum_{k=1}^{n} \mu_k = o(a_n)$.

We consider first the case of Theorem 1. It follows then from (9) that, for
an arbitrary integer $N$ and $n \ge N$,


(22) | 1/an Spe | S O(N/an) |x| dV (2) 
k=1 k=N am 
= 0(N/an) + n/an f dV(x) +1/an f dV(a). 
k=N 


<< an 


Now, using the second condition (10), 


] ] 


(23) f | dV (2) <Si/a, f |e | dV(e) S234 dV (c) 


2A f dV(c) +23 
j=n 




The last series converges by assumption; the first term to the right tends to 
zero since the integral is the general term of a convergent series with decreasing 
terms. As for the last expression in (22) 


1/an |2| dV (2) f dV (zx), 


apsla| <an |a|=a, 


and the last series becomes arbitrarily small for $N$ sufficiently large.
It remains to prove (21) for the case of Theorem 2. Then the series (11)
cannot converge if $a_n/n$ remains bounded. It follows from (13) that


k=1 k=l 


ene O(Nay/an) n/An | | dV (x) 


= 0(Nay/an) + f |2| dX (zx). 


That this tends to zero has already been shown in the second part of (23). 


CORNELL UNIVERSITY,
ITHACA, N. Y.


SCALE HYPERSURFACES FOR CONFORMAL-EUCLIDEAN 
SPACE.* 


By YUNG-CHOW WONG.¹


1. Introduction. We shall give in this paper generalizations to n-space
of some of the results obtained recently by Kasner and DeCicco (3)² for the
scale curves in conformal maps of a surface upon a plane. It will be observed
that this subject is closely connected with the subject of the isoparametric
hypersurfaces³ of Levi-Civita (4) and Segre (6) and incidentally connected with
that of the subprojective Riemannian space of Kagan (2) and Schapiro (5).
For notation and convention we shall follow Eisenhart (1), and in particular,
we shall confine ourselves to real variables and real functions.


The fundamental form

(1.1) $ds^2 = e^{2\sigma} \sum_i (dx^i)^2$

represents a conformal-Euclidean n-space $C_n$, conformable to the Euclidean
n-space $R_n$.⁴ The scalar curvature of $C_n$ with fundamental form (1.1) is
(Eisenhart 1, p. 90)

(1.2) $R = (n-1)e^{-2\sigma}[2\Delta_2\sigma + (n-2)\Delta_1\sigma]$,

where

(1.3) $\Delta_1\sigma = \sum_i (\sigma_{,i})^2$, $\Delta_2\sigma = \sum_i \sigma_{,ii}$ $(i = 1, \cdots, n)$

are the first and second differential parameters with respect to $R_n$; the indices
after the comma indicate partial differentiation.

The hypersurfaces $\sigma$ = const. in $R_n$ are the scale hypersurfaces in the
mapping of $C_n$ on $R_n$.

Any simple family of hypersurfaces f= const. in Rn is called quasi- 
isothermal if it represents the scale hypersurfaces of a conformal map of some 


* Received May 13, 1945. 

1 Harrison Research Fellow at the University of Pennsylvania.

2 References quoted are listed at the end of this paper.

3 We remark that the subject of isoparametric hypersurfaces in a space of constant
curvature has been investigated by E. Cartan in recent years.

4 It is found convenient here to write the rectangular Cartesian coordinates in $R_n$
sometimes as $x^1, \cdots, x^n$ and sometimes as $x_1, \cdots, x_n$.




$C_n$ on $R_n$ such that the scalar curvature of $C_n$ is constant over each of the
scale hypersurfaces.

The family of hypersurfaces f = const. can represent the scale hypersurfaces
of that class of $C_n$ for which $\sigma = \sigma(f)$. The scalar curvature for this
class of $C_n$ is

(1.4) $R = (n-1)e^{-2\sigma}\{2\sigma'\Delta_2 f + [2\sigma'' + (n-2)(\sigma')^2]\Delta_1 f\}$.

Thus, f = const. is quasi-isothermal if and only if

(1.5) $2\sigma'\Delta_2 f + [2\sigma'' + (n-2)(\sigma')^2]\Delta_1 f$ = function of f.

Let $F = F(f)$, then $\Delta_2 F = F'\Delta_2 f + F''\Delta_1 f$. Therefore, if we choose F
so that

(1.6) $\sigma''/\sigma' + \tfrac{1}{2}(n-2)\sigma' = F''/F'$,

then condition (1.5) becomes

(1.7) $\Delta_2 F$ = function of F.


Hence, f = const. is a family of quasi-isothermal hypersurfaces if and only 
if a function F of f exists such that (1.7) is satisfied. In particular, if F 
is any function satisfying this condition, then F = const. is a family of quasi- 
isothermal hypersurfaces. The scale function o of the corresponding mapping 
is given by (1.6). 
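
The step from (1.5) to (1.7) is a short chain-rule computation. The following worked lines are a sketch under the reading that $\sigma = \sigma(f)$ and $F = F(f)$, with primes denoting differentiation with respect to f, and that (1.6) is $\sigma''/\sigma' + \tfrac{1}{2}(n-2)\sigma' = F''/F'$; they are offered as a check, not as the author's own derivation.

```latex
% Assumes sigma = sigma(f), F = F(f); primes denote d/df.
\begin{align*}
\Delta_1\sigma &= (\sigma')^2\,\Delta_1 f, &
\Delta_2\sigma &= \sigma'\,\Delta_2 f + \sigma''\,\Delta_1 f,\\
2\Delta_2\sigma + (n-2)\Delta_1\sigma
  &= 2\sigma'\,\Delta_2 f + \bigl[2\sigma'' + (n-2)(\sigma')^2\bigr]\Delta_1 f .
\end{align*}
% Dividing by 2 sigma' and inserting (1.6):
\begin{align*}
\Delta_2 f + \Bigl[\frac{\sigma''}{\sigma'} + \tfrac{1}{2}(n-2)\sigma'\Bigr]\Delta_1 f
  = \Delta_2 f + \frac{F''}{F'}\,\Delta_1 f
  = \frac{\Delta_2 F}{F'} .
\end{align*}
% Hence the left side of (1.5) equals (2 sigma'/F') Delta_2 F, and it is a
% function of f exactly when Delta_2 F is a function of F.
```

Since $\sigma'$ and $F'$ depend on f alone, the factor $2\sigma'/F'$ does not affect whether the expression is constant on each hypersurface f = const.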

We know that if F satisfies

(1.8) $\Delta_1 F$ = function of F,

the hypersurfaces F = const. are parallel hypersurfaces. Hypersurfaces
F = const. for which both (1.7) and (1.8) are satisfied have been called
isoparametric hypersurfaces by Levi-Civita (4) and Segre (6), who proved
that

All (real) isoparametric hypersurfaces in $R_n$ ($n \ge 2$) are one of the
following three types: (a) parallel hyperplanes, (b) concentric hyperspheres,
(c)⁵ generalized coaxial cylinders of rotation (i.e., hypersurfaces which are
generated from a pencil of concentric hyperspheres in an r-plane ($r \le n - 1$)
by means of the $\infty^{n-r}$ translations contained in $R_n$ and orthogonal to the
r-plane).

For n = 2 this is essentially Kasner and DeCicco’s Theorems 2 and 3. 


5 We note that (b) may be considered as an extreme case of (c) when r = n.




We shall now proceed to prove the following generalizations of Kasner and
DeCicco’s Theorems 4 and 5:

THEOREM 1.1. A family of $\infty^1$ hyperplanes is a family of quasi-isothermal
hypersurfaces if and only if it is a pencil of hyperplanes. The
fundamental form of the corresponding $C_n$ is reducible to

(1.9) $ds^2 = J(x_1)(dx_1^2 + \cdots + dx_n^2)$,

where J may be constant or not; in the latter case the pencil of hyperplanes
must be parallel hyperplanes.

THEOREM 1.2. A family of $\infty^1$ generalized cylinders of rotation which
are generated from a family of $\infty^1$ hyperspheres in an r-plane is a family of
quasi-isothermal hypersurfaces if and only if it is (a) a pencil of coaxial
cylinders for n > 2, or (b) a pencil of circles for n = 2. The fundamental
form of the corresponding $C_n$ is reducible to

$(1.10)_a$ $ds^2 = J(x_1^2 + \cdots + x_r^2)(dx_1^2 + \cdots + dx_n^2)$,

or

$(1.10)_b$ $ds^2 = J(x_1^2 + x_2^2)(dx_1^2 + dx_2^2)$,

respectively, where in $(1.10)_b$ $J \ne$ const. or $=$ const. according as the pencil
of circles is concentric or not.

COROLLARY 1.2. For n > 2, a family of $\infty^1$ hyperspheres is a family
of quasi-isothermal hypersurfaces if and only if it is a pencil of concentric
hyperspheres. The fundamental form of the corresponding $C_n$ is reducible to

(1.11) $ds^2 = J(x_1^2 + \cdots + x_n^2)(dx_1^2 + \cdots + dx_n^2)$.

By definition a subprojective Riemannian n-space is a Riemannian space
which can be so mapped on $R_n$ that the images of the geodesics are curves
lying in planes which pass through a fixed point. Schapiro (5) proved that
the $C_n$’s with fundamental forms (1.9) and (1.11) are two of the three and
only classes of subprojective Riemannian space. Theorems 1.1 and 1.2 therefore
furnish a new characterization of these two classes of subprojective
Riemannian space. Schapiro also proved that the $C_n$ with fundamental form
(1.10) can be so mapped on $R_n$ that the images of its geodesics are curves
lying on (r + 1)-planes.


2. Proof of Theorem 1.1. Let F = const. be a family of (non-isotropic)
hyperplanes in $R_n$ with rectangular Cartesian coördinates $x^h$
(h, i, j, k = 1, ···, n). Then there exist some definite functions $a_h(F)$,
$a_0(F)$ of F such that the equation

(2.1) $a_h(F)x^h + a_0(F) = 0$

is satisfied identically. Differentiations give
ai 
+ a)’ 
=== SF ii = — ao’) 
(A.F) j= | — 3A4 (an’’x" ac ) — ( Sa { iy 


where A is some function of 2". From this it follows that the condition (1.7), 
namely, 
(AP) = — (A2F) = 0, 
is 
{— (Sai?) apy” + [— + 3A + ao”) Sai? = 0, 
i.e., since Sa,” ~ 0, 


(2.2) $a_j'' + Ba_j' + Ca_j = 0$,
where C is some function of $x^h$ and


Saja,’ +- a,” 
(2.3) 


Sa,” + ay’ 

Let us consider (2.2) as n linear equations in B and C. Two cases 
arise according as only one or more than one of the equations in (2.2) are 
independent. For the former case, we have 

(2. 4) aj = a; = aj =bjp", 

where $p = p(F)$, $b_j = b_j(x^h)$. From (2.4) it follows that $b_j = b_j(F)$
= const. The family of hyperplanes (2.1) now becomes

$p(F)(b_h x^h) + a_0 = 0$,

which is a pencil of parallel hyperplanes. Therefore we may suppose that
$p = 1$, so that $a_j = b_j$ = const. and equations (2.2) are satisfied by $C = 0$.
Then we have from (2.1) that 


af—o. 


Therefore equation (1.5) is satisfied. The corresponding $C_n$ has the fundamental
form $J(b_h x^h)(dx_1^2 + \cdots + dx_n^2)$, which can of course be reduced to
$J(x_1)(dx_1^2 + \cdots + dx_n^2)$ by a suitable change of coördinates.


We now suppose that in the equations (2.2) in B and C, there are at
least two independent equations. Then since $a_j = a_j(F)$, both B and C are
functions of F. We have therefore from (2.3)

(2.5) $\dfrac{a_h'' x^h + a_0''}{a_h' x^h + a_0'} = s(F)$, say.

Moreover, since the $a_j$, as functions of the independent variable F, are now
solutions of the linear homogeneous differential equation (2.2), there exist
two functions $p(F)$ and $q(F)$ and some constants $b_j$ and $c_j$ such that

(2.6) $a_j = b_j p(F) + c_j q(F)$.

It is readily seen that because there is more than one independent equation
in (2.2), the p and q and also the $b_j$ and $c_j$ are not proportional.
Using (2.6) in (2.1) and (2.5), we have

(2.7) $(b_h x^h)p + (c_h x^h)q + a_0 = 0$,
$(b_h x^h)(p'' - sp') + (c_h x^h)(q'' - sq') + a_0'' - sa_0' = 0$.

These two equations must be dependent; otherwise, both $b_h x^h$ and $c_h x^h$ would
be functions of F and therefore they could differ only by a constant factor,
i.e., the $b_j$ and $c_j$ would be proportional.

This being the case, we have

(2.8) $p'' - sp' = rp$, $q'' - sq' = rq$, $a_0'' - sa_0' = ra_0$,

where $r = r(F)$. From these it follows that

(2.9) $a_0 = b_0 p(F) + c_0 q(F)$, ($b_0$, $c_0$ constant).

In consequence of (2.6) and (2.9), equation (2.1) can be written as

(2.10) $(b_h x^h + b_0)p(F) + (c_h x^h + c_0)q(F) = 0$,

which represents a pencil of non-parallel hyperplanes.

We now prove that the $C_n$ corresponding to this quasi-isothermal family,
which consists of a pencil of non-parallel hyperplanes, is itself Euclidean.
From (2.10) we have

$\sigma = \sigma(f)$, $f = \dfrac{b_h x^h + b_0}{c_h x^h + c_0}$,

$\Delta_1 f = v^2\bigl(\sum b_h^2 - 2f\sum b_h c_h + f^2\sum c_h^2\bigr)$, $\Delta_2 f = 2v^2\bigl(-\sum b_h c_h + f\sum c_h^2\bigr)$,

where $v = 1/(c_h x^h + c_0)$. Substitution of these in (1.4) and (1.5) gives

$R = v^2 \cdot$ (a function of f) = function of f.

From this it follows that $R = 0$; for if $R \ne 0$, the $b_j$ would be proportional
to $c_j$. Therefore the $C_n$ is an $R_n$.


3. Proof of Theorem 1.2. Consider a family of $\infty^1$ generalized cylinders
of rotation which are generated from a family of $\infty^1$ hyperspheres in a
fixed r-plane in $R_n$ ($r \le n$). We may suppose, after a suitable change of
rectangular Cartesian coördinates, that this r-plane is $x^{r+1} = \cdots = x^n = 0$.
Then, if F = const. is this family of cylinders of rotation, there exist some
definite functions $a_h(F)$, $a_0(F)$ of F such that the equation


(3.1) = 2[an(F)2* + ao(F) (h, 1, 7,4 
is satisfied identically. Equation (3.1) can be written 
(3. 2) ay)? = r? = Say? 


Differentiations of (3.1) give 
Ti — 
an’ + ay’ ’ 
AF = SF 44 = nk — — an) — + a0”), 
(AF); = + [—(n + 2)dA? + Saw’ — an) 
+ 3d*r? + ag”) Ja;’ + AF 


Fy; = A(ri — ai) 


where A is some function of z*. The condition for A.F to be a function of 
F: (AF) = 0 is therefore 


(3.3)    $a_i'' + B a_i' = C(x_i - a_i)$,
where C is some function of $x^h$ and


(3.4) B= (n+ 2)Atr? — (an — an) — + 


We consider first the case C = 0, and then show that the case $C \ne 0$
leads to a contradiction. When C = 0, (3.3) is of the form $a_i'' + B a_i' = 0$.
It follows from this that B = B(F) and then that


(3.5)$_1$    $a_i = b_i\,p(F) + c_i$,


where the b’s (not all zero) and c’s are constants. Furthermore, we now have 
from (3.4) after simplification that 


Ih 
(n — 2)ar’x* — 3 — r? == function of F. 


When we substitute the values $a_i$ from (3.5)$_1$, this becomes


” 
(3. 6) (n — 2) (baa") p’ —3 r? 


p’ + ay’ = function of F. 


If $(n - 2)p' \ne 0$, this equation would demand that $b_h x^h$ be a function of
F, which is contradictory to the hypothesis that F = const. is a family of
cylinders of rotation. Therefore we must have $(n - 2)p' = 0$.


If $p' = 0$, (3.6) is satisfied and (3.5)$_1$ may be supposed to reduce to
(3.7)    $a_i = c_i$ = const.

This, together with (3.1), shows that we have a pencil of coaxial cylinders.
If $p' \ne 0$, then n = 2. And since $b_h x^h \ne$ function of F, equation (3.6)
requires that
$a_0''\,p' - a_0'\,p'' = 0$,

so that
(3.5)$_2$    $a_0 = b_0\,p(F) + c_0$    ($b_0$, $c_0$ constant).

On account of this and (3.5)$_1$, equation (3.1) represents a pencil of cylinders.

Hence we have two cases: (3.7) or (3.5) with $p' \ne 0$. The former
case can happen for any n (> 1), while the latter case can happen only for
n = 2, which is the case of Kasner and DeCicco.


For the case (3.7), equation (3.1) becomes 
Consequently, 


o=oa(f), f= Aif —4(f + Asaf 2n. 


Condition (1.5) is therefore satisfied, and the fundamental form of the
corresponding $C_n$ is reducible to

$ds^2 = J(x_1^2 + \cdots + x_r^2)(dx_1^2 + \cdots + dx_n^2)$.
For the case (3.5), equation (3.1) becomes
$\Sigma x_h^2 = 2(b_h x^h + b_0)\,p(F) + 2(c_h x^h + c_0)$.

Therefore $\sigma = \sigma(f)$ and


— + co) __ +e 
baa" + bo bry" 


where 6 and ¢ are some constants. From this we have, writing v* = daz" 


bo bry" b, 
= v?[4(bf —c) + 


Aof = 2nv — + 2v? (bn?) (Syn? + €), 
the latter of which becomes, for n = 2, 
Aof = v?[4b + 2f(3dx7) ]. 
Therefore from (1.4) and (1.5) we have 
$R = v^2 \cdot$ (a function of f) = function of f.

R must be zero; otherwise, the family of quasi-isothermal hypersurfaces
f = const. would be $v^{-1} = b_h x^h + b_0$ = const., contradictory to hypothesis.
Therefore $C_n$ is a Euclidean space.

We now suppose that the C in (3.3) is not zero. If the $a_i''$ are all zero,
(3.3) becomes $B a_i' = C(x_i - a_i)$. Squaring this and then summing over i,
we have $B^2\,\Sigma a_i'^2 = C^2 r^2$, showing that the preceding equation can be written
$x_i$ = function of F. From this it follows that the family of quasi-isothermal
hypersurfaces F = const. is $x_i$ = const., contradictory to our hypothesis. Therefore
the $a_i''$ are not all zero. Let us now multiply (3.3) by $a_i''$, $a_i'$, $x_i - a_i$
respectively and then sum each of the results over i; we have


         $\Sigma (a_h'')^2 + B\,\Sigma a_h' a_h'' = C\,\Sigma a_h''(x_h - a_h)$,
(3.8)    $\Sigma a_h' a_h'' + B\,\Sigma (a_h')^2 = C\,\Sigma a_h'(x_h - a_h)$,
         $\Sigma a_h''(x_h - a_h) + B\,\Sigma a_h'(x_h - a_h) = C r^2$.

These are three equations in C, $X = \Sigma a_h'(x_h - a_h)$, $Y = \Sigma a_h''(x_h - a_h)$, with
functions of F as coefficients. We shall now prove that these equations can be
solved for C, X, Y as definite functions of F, and consequently, we would
have a contradiction with (3.1).

Eliminating B and C from (3.8) and C from the last two equations 
of (3.8), we have, respectively, 


(3.9) ¥? XY 
+ (San?) X? + 1? [ — (San’?) (San?) ] = 0, 
and 
Saran” + = + BY). 


na” The last equation is, on substitution for the value of B from (3.4), 
(3. 10) r ay” ( San’? — 
[ + (m+ 2) Baran’ + — 


X + Sana’ + a,’ 


Writing this for brevity as 


ry 9 2 +4 
(3.10’) a—XY¥ + (n—2)X + (n+ 2)y—3r? 


and solving for Y, we have 


(n — + Qnyr 2X8 


where the unwritten terms are of the second or lower degree in X. When this 


2 value of Y is substituted in (3.9), the latter becomes a polynomial equation 
in X: 

r0, (San) [ — + + 

(3. 11) (2X? — yX — 3pr?)X 

fen + yX — 

nal + r?[ — (Say 2 — 3Br?)? 

Ki If n ~ 2, the coefficient of X* in (3.11) is (n— 2)*r-*3ay”*, which is not 


zero since the a,’ are not all zero. 
If n = 2, the coefficient of X* in (3.11) is 


(4y1-?) San’? -— 4yr* + 4307” 


= 43 (ay”” — an’)? = 43 (an” — an’)? 


the last equality follows immediately from the definition of r and y. This 


coefficient is zero if and only if 


aj” — = 0 


Comparing this with (3.3) we have equations of the form $E a_i' = C(x_i - a_i)$.
This, as we saw before, would lead to a contradiction.

Therefore in both cases equation (3.11) requires that X be a function
of F. Hence the case $C \ne 0$ is impossible, and the proof of Theorem 1.2 has
been completed.




THE UNIVERSITY OF PENNSYLVANIA. 


REFERENCES. 


1. Eisenhart, L. P., Riemannian Geometry (1926), Princeton.

2. Kagan, B., “Über eine Erweiterung des Begriffes vom projektiven Raume und dem
zugehörigen Absolut,” Abhandlung des Seminars für Vektor- und Tensoranalysis I
(1933, Moskau), pp. 12-101.

3. Kasner, E. and DeCicco, J., “Geometry of scale curves in conformal maps,” American
Journal of Mathematics, vol. 67 (1945), pp. 157-166.

4. Levi-Civita, T., “Famiglie di superficie isoparametriche nell’ordinario spazio
euclideo,” Atti Accad. naz. Lincei, Rend., (6), vol. 26 (1937), pp. 355-362.

5. Schapiro, H., “Über die Metrik der subprojektiven Räume,” Abhandlung des
Seminars für Vektor- und Tensoranalysis I (1933, Moskau), pp. 102-124.

6. Segre, B., “Famiglie di ipersuperficie isoparametriche negli spazi euclidei ad un
qualunque numero di dimensioni,” Atti Accad. naz. Lincei, Rend., (6), vol. 27
(1938), pp. 203-207.


A FACTORIZATION OF THE DENSITIES OF THE IDEALS IN 
ALGEBRAIC NUMBER FIELDS.* 


By AuREL WINTNER. 


Introduction. Let $\mathfrak{K}$ be an algebraic number field, $\zeta(s;\mathfrak{K})$ its zeta-
function and $C = C(\mathfrak{K}) > 0$ the residue of this zeta-function (at s = 1).
According to the fundamental limit-theorem of Dirichlet-Dedekind (cf., e.g.,
[1], pp. 142-149), the (integral) ideals of $\mathfrak{K}$ have a finite and non-vanishing
density, that is, the number of the ideals having a norm not exceeding x is
asymptotically proportional to x, as $x \to \infty$. If Dirichlet’s elementary Abelian
lemma (cf., e.g., [1], pp. 152-154) is applied to the Dirichlet series of
$\zeta(s;\mathfrak{K})$, it becomes evident that the numerical value of the asymptotic density
must be the residue C, provided that the existence of this density is granted.
What is not evident is the existence of this density. In fact, when Dirichlet
and Dedekind developed their theory of unities, their main, or rather sole,
purpose was a proof for the existence of an asymptotic factor of proportionality
(for historical references cf. [7]). However, as was shown in [7],
this existence theorem can today be proved in a manner which completely
avoids the theory or the existence of unities.


The positive results of the present paper (some of them will be negative) 
supply for the asymptotic density a peculiar evaluation, rather than an exist- 
ence proof. While the classical evaluation involves such data as the regulator 
and the numbers of real and complex unities in the field, the evaluation which 
will result only contains data depending directly on the laws of factorization 
of the integral ideals. In particular, the residue C will appear as a product 
extended over the sequence of all rational primes. 

This evaluation of the asymptotic density is not difficult to prove; it is
“elementary” in the technical sense of the analytic theory of ideals. That it
does not seem to have been observed before, may be explained by the fact that
it is fully disguised in the case of the rational field, since $a_p$ then becomes 0
in every factor of the infinite product $\prod (1 + a_p)$ representing the residue
(which is 1 in case of the rational field). Incidentally, this trivial case presents
the only field for which the product evaluation is absolutely convergent.
The result is as follows:


* Received December 10, 1945. 




With reference to an arbitrary algebraic number field, let 


(1)    $j = j(p) \ge 1$,

where $p = 2, 3, \cdots$, denote the number of the distinct prime ideals which
divide the principal ideal [p], and let

(2)    $g_1 = g_1(p) \ge 1, \ \cdots, \ g_j = g_{j(p)}(p) \ge 1$
be the respective degrees of these j prime ideals. Then the product
(3)    $\prod_p (1 - p^{-1})(1 - p^{-g_1})^{-1}(1 - p^{-g_2})^{-1} \cdots (1 - p^{-g_j})^{-1}$,


where p runs through the sequence of all rational primes in increasing order, 
is convergent and its value represents the asymptotic density of the integral 
ideals in the field. 
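
As a numerical illustration of the product (3), the sketch below evaluates its partial products for the Gaussian field Q(i). This field is chosen here only as an example and does not occur in the paper. The sketch assumes the familiar decomposition law of Q(i): p = 2 gives one prime ideal of degree 1, a prime p of the form 4k + 1 gives two prime ideals of degree 1, and a prime p of the form 4k + 3 gives one prime ideal of degree 2. Under that assumption the density of the integral ideals is $\pi/4$. All function names are hypothetical.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def degrees_gaussian(p):
    """Degrees g_1, ..., g_j of the distinct prime ideals dividing [p] in Q(i)."""
    if p == 2:
        return [1]       # ramified: one prime ideal of degree 1
    if p % 4 == 1:
        return [1, 1]    # two distinct prime ideals of degree 1
    return [2]           # one prime ideal of degree 2


def density_partial_product(limit):
    """Partial product of (3) over the primes p <= limit, taken in increasing order."""
    value = 1.0
    for p in primes_up_to(limit):
        factor = 1.0 - 1.0 / p
        for g in degrees_gaussian(p):
            factor /= 1.0 - float(p) ** (-g)
        value *= factor
    return value


if __name__ == "__main__":
    for limit in (100, 1000, 20000):
        print(limit, density_partial_product(limit), math.pi / 4)
```

The primes are multiplied in increasing order on purpose, since the convergence asserted for (3) is not absolute.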


The proof of the representation (3) of the asymptotic density will depend 
on (i) the extension to algebraic number fields of Mertens’ elementary approxi- 
mation to the prime number theorem, and (ii) a particular criterion assuring 
the legitimacy of a formal Eulerian factorization. 

As to (i), it is clear, and well-known, that any of the proofs of Mertens’
theorem concerning the rational field (in particular, Mertens’ own proof and
Hardy’s “Tauberian” proof; cf. [3], p. 22, where further references are
given) can directly be transcribed to the case of an arbitrary algebraic number 
field. However, since all these proofs depend, very explicitly, on Stirling’s 
theorem and break down, therefore, in more general situations, it seemed to 
be worth proving a more general fact, which is the content of the first of the 
assertions of the theorem italicized below. What concerns (ii), that is, the 
legitimacy of the formal Eulerian derangement, assertion (ii) of the same 
theorem will become applicable. Finally, the negative result, (iii), of the 
theorem will show that the legitimacy of the derangement in question is by no 
means automatic. 

An appendix considers the nature of analytical limitations imposed on
the zeta-function $\zeta(s;\mathfrak{K})$ by the laws of factorization in the field $\mathfrak{K}$, that is,
by the specifically arithmetical character of the data (2), (3) of $\mathfrak{K}$.


1. By a multiplicative function F(n) of the positive integer n is meant
a sequence F(1), F(2), $\cdots$ satisfying F(nm) = F(n)F(m) whenever n and
m are relatively prime. Thus, if the trivial case $F(1) = F(2) = \cdots = 0$
is excluded, the function F(n) is uniquely determined by an arbitrary assignment
of its values attained when n is a prime power, $n = p^k$, where $p = 2, 3, \cdots$
and k > 0 (if k = 0, then F(1) = 1, since $F(1) \ne 0$).
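
The statement that a multiplicative function is determined by its values at prime powers can be made concrete as follows. This is a minimal sketch with hypothetical helper names; the number of divisors, with value k + 1 at $p^k$, is used only as a familiar test case and is not taken from the paper.

```python
def factorize(n):
    """Return {p: k} such that n is the product of the prime powers p**k."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def multiplicative(n, value_at_prime_power):
    """Extend an assignment (p, k) -> F(p**k) to F(n) by multiplicativity."""
    result = 1  # the empty product gives F(1) = 1
    for p, k in factorize(n).items():
        result *= value_at_prime_power(p, k)
    return result


def divisor_count(n):
    """Example: the divisor function, whose value at p**k is k + 1."""
    return multiplicative(n, lambda p, k: k + 1)
```

For instance, `divisor_count(12)` is built from the values at $2^2$ and $3$ alone.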

The general theorem, referred to above, which does not assume the positiveness
of the coefficients (and, in case of positive coefficients, is not restricted
to those, very explicit, situations in which Stirling’s theorem is applicable)
runs as follows:

Let F(n) be a multiplicative function for which the Dirichlet series

(4)    $f(s) = \sum_{n=1}^{\infty} F(n)/n^s$

is absolutely convergent in the half-plane $\sigma > 1$ and represents there a function
acquiring a simple pole at the point s = 1 (except for a vicinity of this
point, the function f(s) need not remain regular along the line $\sigma = 1$). Then

(i) if the absolute value of $F(p^k)$ is less than a constant multiple of
$p^{\theta(k-1)}$ for some fixed $\theta < 1$ and for every prime power $p^k$, the series

(5)    $\sum_p \dfrac{F(p) - 1}{p}$

is convergent; in addition,

(ii) under the assumption of (i), the infinite product

(6)    $\prod_p (1 - p^{-1})\bigl(1 + \sum_{k=1}^{\infty} F(p^k)\,p^{-k}\bigr)$

is convergent and its value is the residue of f(s) at s = 1; however,

(iii) if the common assumption of (i) and (ii) is satisfied for every
$k \ne 2$ but is relaxed from $F(p^2) = O(p^{\theta})$ to $F(p^2) = O(p)$ for k = 2, then
the assertions of (ii) become false, not only because (6) may become divergent,
but also because (6) can converge to a value distinct from the residue; even
though the assumptions made before (i) are satisfied.

It is understood that p in (5) and (6) is supposed to run through the
sequence of all primes in increasing order. This proviso is necessary, since the
convergence of (5) and (6) is not in general absolute.
2. The following proof of (i) will depend on M. Riesz’s extension (to
Dirichlet series) of Fatou’s theorem (on power series). Since this extension
can further be generalized so as to involve just a Fourier condition, rather
than regular-analyticity, near the point t = 0 on the boundary line s = 1 + it
(cf. [4]), the full force of the assumption, according to which the function
(4), where $\sigma > 1$ in $s = \sigma + it$, is the sum of const./(s - 1) and of a function
which is regular-analytic at s = 1, will not be needed. The particular
case sufficient for the proof of the above formulation states that, if a Dirichlet
series

(7)    $\sum_{n=1}^{\infty} a(n)/n^s$

converges in the half-plane $\sigma > 1$ to a function which remains regular-analytic
at the point s = 1, then the (trivially necessary) condition

(8)    $\sum_{n=1}^{x} a(n) = o(x)$,    $x \to \infty$,

is sufficient for the convergence of the series (7) at the point s = 1.
First, it is clear from the assumption $\theta < 1$ of (i) that (4) possesses the
absolutely convergent Eulerian factorization

(9)    $f(s) = \prod_p \bigl(1 + \sum_{k=1}^{\infty} F(p^k)/p^{ks}\bigr)$

in the half-plane $\sigma > 1$ (in fact, even $\theta = 1$ is sufficient to this end). If this
is applied to the case $F(1) = F(2) = \cdots = 1$, in which f(s) becomes
Riemann’s $\zeta(s)$, it follows, by subtraction from the logarithm of (9) itself,
that

(10)    $\log \dfrac{f(s)}{\zeta(s)} = \sum_p \dfrac{F(p) - 1}{p^s} + \sum_p \Bigl[\log\bigl(1 + \sum_{k=1}^{\infty} F(p^k)/p^{ks}\bigr) + \log(1 - p^{-s}) - \dfrac{F(p) - 1}{p^s}\Bigr]$

holds if $\sigma$ is sufficiently large (the logarithm refers to the determination
which tends to 0 as $\sigma \to \infty$). If $\sigma$ is sufficiently large, then, by absolute
convergence, the expression on the right of (10) can be rearranged into a Dirichlet
series (7) in which

(11)    $a(n) = 0$ unless $n = p^k$.


It is clear from the assumption of (i) that each of the remaining coefficients,
$a(p^k)$, of the Dirichlet series of the logarithm (10) has an absolute
value less than a constant multiple of the $(\theta + \epsilon)$-th power of $p^{k-1}$, for every
fixed $\epsilon > 0$. Since $\theta < 1$, it can be assumed, by choosing a $\theta$ somewhat greater
than the given $\theta$, that $a(p^k)$ is majorized by a constant multiple of $p^{\theta(k-1)}$,
where $\theta < 1$. If this is compared with (11), it follows that the Dirichlet
series (7) is absolutely convergent in the half-plane $\sigma > 1$, and that the sum
of those of its terms which do not belong to primes (i.e., in which $n = p^k$,
where k > 1) is absolutely convergent at the point s = 1.

On the other hand, since (7) is identified with (10), the sum of those
terms of (7) which do belong to primes is identical with the first series on the
right of (10), a series which becomes the series (5) at the point s = 1. But
the logarithm on the left of (10) remains regular at s = 1, since the Dirichlet
series (4), where $\sigma > 1$, is supposed to represent a function having a simple
pole at s = 1. Consequently, the convergence of the series (5) will be proved
if it is shown that (8) is satisfied in the present case.
Since $a(p^k)$ is majorized by a constant multiple of $p^{\theta(k-1)}$, it is seen from
(11) that (8) will be ascertained if it is verified that the sum

$\sum_{p^k \le x} p^{\theta(k-1)}$

is o(x) as $x \to \infty$. In order to verify this estimate, let the latter sum be
rearranged into

$\sum_{k=1}^{\infty} \sum_{p \le x^{1/k}} p^{\theta(k-1)}$,

where p is the summation index of the interior sum. Since there are just
O(log x) values of k for which $x^{1/k}$ exceeds at least one p, the upper limit, $\infty$,
of the exterior summation can be replaced by O(log x). On the other hand,
since the number of primes not exceeding N is less than a constant multiple
of N/log N (Chebyshev), the number of terms in the k-th interior sum does
not exceed a constant multiple of $k\,x^{1/k}/\log x$. Finally, the greatest term of
the k-th interior sum is less than the $\theta(k-1)$-th power of $x^{1/k}$. Consequently,
the sum (8) is majorized by a constant multiple of

$\sum_{k=1}^{O(\log x)} k\,x^{1/k}\,x^{\theta(k-1)/k}/\log x$.

Clearly, the first factor, k, of the k-th term of this majorant can be
omitted if $\theta$ is replaced by a somewhat greater $\theta$. Hence, all that remains to
be shown is that, as $x \to \infty$, the estimate

$\sum_{k=1}^{O(\log x)} x^{1/k}\,x^{\theta(k-1)/k}/\log x = o(x)$

holds for every fixed $\theta < 1$. But this is obvious. In fact, if $\theta$ were 1, the last
sum would be

$\sum_{k=1}^{O(\log x)} x^{1/k}\,x^{(k-1)/k}/\log x = \sum_{k=1}^{O(\log x)} x/\log x$,

which is x O(log x)/log x = O(x). And this O becomes an o if $\theta = 1$ is
replaced by $\theta < 1$.
This completes the proof of (i).
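
The o(x) estimate at the end of this proof can be observed numerically. The sketch below is only an illustration under an arbitrary choice $\theta = 1/2$: it sums $p^{\theta(k-1)}$ over all prime powers $p^k \le x$ and divides by x. The function names are hypothetical, and the slow decay of the ratio (roughly like 1/log x, coming from the terms with k = 1) is consistent with the estimate but does not prove it.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def majorant_ratio(x, theta):
    """(1/x) * sum of p**(theta*(k-1)) over all prime powers p**k <= x."""
    total = 0.0
    for p in primes_up_to(x):
        k, power = 1, p
        while power <= x:
            total += p ** (theta * (k - 1))
            k += 1
            power *= p
    return total / x


if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        print(x, majorant_ratio(x, 0.5))
```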


4. The identity (10) resulted from a simultaneous logarithmization of
(9) and of the corresponding factorization of the Riemann zeta-function,
namely, of the infinite product

$f(s)/\zeta(s) = \prod_p (1 - p^{-s})\bigl(1 + \sum_{k=1}^{\infty} F(p^k)/p^{ks}\bigr)$,

where $\sigma > 1$. Let the factors of this infinite product be denoted by $1 + c_p(s)$.
Then

(12)    $f(s)/\zeta(s) = \prod_p (1 + c_p(s))$,
where $\sigma > 1$ and
(13)    $c_p(s) = \alpha_p(s) + \beta_p(s)$,

if $\alpha_p(s)$ and $\beta_p(s)$ are abbreviations for

(14)    $\alpha_p(s) = (F(p) - 1)/p^s$
and
(15)    $\beta_p(s) = \sum_{k=2}^{\infty} (F(p^k) - F(p^{k-1}))/p^{ks}$.

Since $(s - 1)\zeta(s) \to 1$ as $s \to 1$, the assertion of (ii) is equivalent to
the statement that (12) remains valid at the point s = 1, if the value attained
at s = 1 by the quotient on the left of (12) is meant to be the residue of f(s)
at s = 1. In other words, (ii) will be proved if it is shown that the infinite
product (12) is convergent on the closed half-line $s \ge 1$ and represents there
a function which is continuous (at s = 1). In view of (iii), the second of
these assertions is independent of the first.
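
The splitting (13) of a factor into the parts (14) and (15) is a purely formal identity, which the following sketch checks in exact rational arithmetic. It truncates the inner series at k = K, so the identity holds up to the explicit remainder $-F(p^K)\,x^{K+1}$ with $x = p^{-s}$. The sample values $F(p^k) = k + 1$ and all names are illustrative assumptions.

```python
from fractions import Fraction


def euler_factor(values, x, K):
    """(1 - x) * (1 + sum_{k=1}^{K} F(p^k) x^k), where values[k] = F(p^k)."""
    return (1 - x) * (1 + sum(values[k] * x ** k for k in range(1, K + 1)))


def alpha(values, x):
    """The term (14): (F(p) - 1) * x, with x standing for p^(-s)."""
    return (values[1] - 1) * x


def beta(values, x, K):
    """The series (15) truncated at k = K."""
    return sum((values[k] - values[k - 1]) * x ** k for k in range(2, K + 1))


# Hypothetical sample: F(p^k) = k + 1, p = 3, s = 2, so that x = 1/9.
VALUES = [1, 2, 3, 4, 5, 6]   # VALUES[0] = F(1) = 1
X = Fraction(1, 9)
K = 5
```

Here `euler_factor(VALUES, X, K)` equals `1 + alpha(VALUES, X) + beta(VALUES, X, K) - VALUES[K] * X ** (K + 1)` exactly.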

Since the case k = 1 of the assumption of (i) means that F(p) is a
bounded function of p, it is clear from (14) that

(16)    $\sum_p \max_{1 \le s} |\alpha_p(s)|^2 < \infty$.

Similarly, the series (15) is majorized by a constant multiple of

$\sum_{k=2}^{\infty} 2p^{\theta(k-1)}/p^{ks} \le 2\sum_{k=2}^{\infty} p^{\theta(k-1)-k}$

if $s \ge 1$. As $p \to \infty$, the sum of the last series remains less than a constant
multiple of its first term, which is $p^{\theta-2}$. Since $\theta - 2 < -1$, it follows that
$\sum |\beta_p(s)|$ is majorized for $s \ge 1$ by a convergent numerical series, and so

(17)    $\sum_p \max_{1 \le s} |\beta_p(s)| < \infty$.

In particular

(18)    $\sum_p \max_{1 \le s} |\beta_p(s)|^2 < \infty$.

According to a standard convergence criterion of Cauchy, an infinite
product $\prod (1 + c_p)$ satisfying $\sum |c_p|^2 < \infty$ is convergent if and only if the
series $\sum c_p$ is, and a corresponding criterion holds for uniform convergence.
But (18), (16) and (13) imply, by Schwarz’s inequality, that the series
$\sum |c_p(s)|^2$ is uniformly convergent for $s \ge 1$. Hence, the product (12) is
uniformly convergent for $s \ge 1$ if and only if the same is true of the series
$\sum c_p(s)$. In view of (13) and (17), this will be the case if and only if the
series $\sum \alpha_p(s)$ is uniformly convergent for $s \ge 1$. Finally, (14) shows that
$\sum \alpha_p(s)$ is a Dirichlet series which at s = 1 becomes the series (5), the
convergence of which is assured by (i). Since a Dirichlet series which is convergent
at s = 1 must be uniformly convergent for $s \ge 1$ (Abel-Jensen), it follows
that $\sum c_p(s)$ is uniformly convergent for $s \ge 1$.

Consequently, the product (12) is uniformly convergent for $s \ge 1$. This
is slightly more than what was needed for the completion of the proof of (ii),
which was seen to depend on whether or not the product (12) converges for
$s \ge 1$ to a continuous function.


5. What concerns (iii), the possibility of a divergent product (6) is
obvious under the assumptions of (iii). The remaining statement of (iii) is
that the product (12) may converge on the closed half-line $s \ge 1$ to a function
having a discontinuity of the first kind at s = 1, if the assumptions of (iii)
are satisfied.

An example proving this possibility results by choosing the multiplicative
function F(n) as follows: $F(p^k) = p^{k/2}$ or $F(p^k) = 0$ according as k is even
or odd. Then all assumptions of (iii) are satisfied. Furthermore, since

$1 + \sum_{k=1}^{\infty} F(p^k)\,p^{-k} = 1 + \sum_{k=1}^{\infty} p^{-k} = (1 - p^{-1})^{-1}$,

every factor of the product (6) is 1. Hence, all that remains to be shown is
that the function f(s) has at s = 1 a simple pole the residue of which is
distinct from 1. But (9) shows that, if $\sigma > 1$,

$f(s) = \prod_p \bigl(1 + \sum_{k=1}^{\infty} p^k/p^{2ks}\bigr) = \prod_p (1 - p^{1-2s})^{-1}$,

which means that f(s) can be obtained by substituting 2s - 1 for s in the
factorization of Riemann’s $\zeta(s)$. And the residue of $f(s) = \zeta(2s - 1)$ at
s = 1 is distinct from 1, since $(s - 1)\zeta(s) \to 1$ and $((2s - 1) - 1)/(s - 1) \to 2 \ne 1$
as $s \to 1$.
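
The claim that every factor of (6) equals 1 for this example can be checked in exact arithmetic. The sketch below assumes the reading $F(p^k) = p^{k/2}$ for even k and $F(p^k) = 0$ for odd k, which is the choice that makes f(s) equal to $\zeta(2s - 1)$. With the inner series cut off at k = K = 2J, the factor is exactly $1 - p^{-(J+1)}$, which tends to 1. The function names are hypothetical.

```python
from fractions import Fraction


def F(p, k):
    """Counterexample values: F(p^k) = p^(k/2) for even k, and 0 for odd k."""
    return p ** (k // 2) if k % 2 == 0 else 0


def factor_of_product_6(p, K):
    """(1 - 1/p) * (1 + sum_{k=1}^{K} F(p^k) / p^k), computed exactly."""
    partial = 1 + sum(Fraction(F(p, k), p ** k) for k in range(1, K + 1))
    return (1 - Fraction(1, p)) * partial


if __name__ == "__main__":
    for p in (2, 3, 5):
        print(p, float(factor_of_product_6(p, 40)))
```

So the product (6) converges to 1, whereas the residue of $\zeta(2s - 1)$ at s = 1 is 1/2.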

It should be mentioned that a similar example can be derived from a 
consideration of Hardy [2] concerning the discontinuity of certain expressions 


involving Ramanujan sums. 


6. The theorem italicized in the Introduction will now be deduced
from (ii).

Dedekind’s zeta-function of an algebraic number field $\mathfrak{K}$ is defined, if
$\sigma > 1$, by

$\zeta(s;\mathfrak{K}) = \prod_{\mathfrak{p}} (1 - (N\mathfrak{p})^{-s})^{-1}$,

where $\mathfrak{p}$ ranges through all prime ideals of $\mathfrak{K}$ and $N\mathfrak{p}$ is the norm of $\mathfrak{p}$. If the
factors occurring in this product are arranged in the order corresponding to
the principal ideals [p], where p is a rational prime, it is seen that, if $\sigma > 1$,

(19)    $\zeta(s;\mathfrak{K}) = \prod_p (1 - p^{-g_1 s})^{-1} \cdots (1 - p^{-g_j s})^{-1}$,

where p runs through the sequence of all rational primes and g, j denote the
positive integers defined in (2), (1).

In the definition of the function (1) of the rational prime p, the role of
the restriction to distinct prime ideals is clear from the fact that the number
of all prime ideals dividing the principal ideal [p] is independent of p. In
fact, if

(20)    $l_1 = l_1(p) \ge 1, \ \cdots, \ l_j = l_{j(p)}(p) \ge 1$

denote the multiplicities of the j distinct prime ideals which divide [p] and
have the respective degrees (2), then the number of all prime ideals
dividing [p] is the sum

(21)    $l_1(p)\,g_1(p) + \cdots + l_{j(p)}(p)\,g_{j(p)}(p)$,

which is just m for every p, if m denotes the degree of $\mathfrak{K}$. In addition,

(22)    $g_1(p) + \cdots + g_{j(p)}(p) = m$ for every $p > |d|$,

if d is the discriminant of $\mathfrak{K}$. In fact, each of the j multiplicities (20)




occurring in the decomposition (21) of m is 1 unless p divides the discriminant.
This follows from the classical lemma according to which the
factorization of a principal ideal [p] into prime ideals contains a multiple
factor (if and) only if the rational prime p is a divisor of the discriminant.

Thus it is clear that, if (19) is identified with (9), then the absolute
value of $F(p^k)$ is less than a constant multiple of $p^{\theta(k-1)}$ for every fixed
$\theta = \epsilon > 0$. It follows therefore from (ii) that the product (3), which results
by inserting the factors $1 - p^{-1}$ into the case s = 1 of the product (19), is
convergent, and that its value is the residue of $\zeta(s;\mathfrak{K})$ at s = 1. Since this
residue is identical with the asymptotic density (Dirichlet-Dedekind), the
proof is complete.


7. What concerns (i), it was mentioned in the Introduction that, in the
particular case (19) of (9), the convergence of the series (5) may be obtained
from Mertens’ theorem for $\mathfrak{K}$. The actual content of (i) admits, in the present
case, the following interpretation:

With reference to a fixed algebraic number field $\mathfrak{K}$, let a rational prime p
be called “normal” if the number of those distinct prime ideals of first degree
which occur in the factorization of the principal ideal [p] is exactly 1. And
let the remaining, or “abnormal,” rational primes (if any) be classified as
“defective” or “excessive” according as that number is 0 or at least 2 (it can
never be greater than m, the degree of $\mathfrak{K}$). For instance, every p is normal if
$\mathfrak{K}$ is the rational field. If m = 2, there are various senses in which it is
meaningful to say that, as $x \to \infty$, there are in the range 1 < p < x about as many
primes p for which the discriminant of $\mathfrak{K}$ is quadratic residue as primes p
for which it is quadratic non-residue (one of the many possible formulations
of this principle, which are of varying degree of analytical “depth,” was
proved by Pólya [5]). If $\mathfrak{K}$, instead of being (real and) quadratic, is an arbitrary
algebraic number field, one will expect a corresponding asymptotic balance
between the two sets of abnormal primes. But what the convergence of the
series (5) means is precisely such a balance.

In fact, let $\nu_p = \nu_p(\mathfrak{K})$, where $p = 2, 3, \cdots$, denote Kronecker’s index,
defined as the number of those distinct prime ideals of first degree which
divide the principal ideal [p] (so that $\nu_p$ is non-negative and cannot exceed
the degree of $\mathfrak{K}$). Then, if (19) is identified with (9), it is clear from the
definition of the positive integers (1), (2) that F(p) can be identified with $\nu_p$.
Hence, (i) means the convergence of the series $\sum (\nu_p - 1)/p$ (which must be
arranged in the order of the monotone sequence of all rational primes). But
the factor $\nu_p - 1$ which here multiplies the term 1/p of the divergent series
$\sum 1/p$ is -1, 0 or at least 1 according as p is defective, normal or excessive.




Thus the convergence of the series $\sum (\nu_p - 1)/p$ means for the average behavior
of the factor $\nu_p - 1$ a specific kind of asymptotic oscillation, representing the
“balance” in question.
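
The “balance” can be made visible in a simple case. The sketch below again uses the Gaussian field Q(i), which is an illustrative choice not made in the paper, and assumes its standard decomposition law: Kronecker's index is 1 for p = 2 (normal), 2 for a prime of the form 4k + 1 (excessive) and 0 for a prime of the form 4k + 3 (defective). The partial sums of $\sum (\nu_p - 1)/p$, taken over the primes in increasing order, then settle down while $\sum 1/p$ diverges. All names are hypothetical.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def kronecker_index_gaussian(p):
    """Number of distinct prime ideals of first degree dividing [p] in Q(i)."""
    if p == 2:
        return 1                      # normal
    return 2 if p % 4 == 1 else 0     # excessive or defective


def balance_partial_sum(limit):
    """Partial sum of (nu_p - 1)/p over the primes p <= limit, in increasing order."""
    return sum((kronecker_index_gaussian(p) - 1) / p for p in primes_up_to(limit))


if __name__ == "__main__":
    for limit in (10 ** 3, 10 ** 4, 10 ** 5):
        print(limit, balance_partial_sum(limit))
```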


Appendix. 


The integral-valued functions (1), (2), (20) of the rational prime p
substantially describe the laws of factorization of the integral ideals of an algebraic
number field $\mathfrak{K}$. Conversely, these laws of a given $\mathfrak{K}$ determine the functions
(1), (2), (20) uniquely. But fields $\mathfrak{K}$ for which the mystery of these laws has
been solved are scarce indeed. And the following considerations imply that, due
to this purely algebraical mystery, the difficulties of anything like Riemann’s
hypothesis in case of an arbitrary algebraic number field $\mathfrak{K}$ go much deeper
than in case of a $\mathfrak{K}$ which is quadratic or cyclotomic (or, for that matter,
rational).

Let a degree-function mean an arbitrary assignment, for every rational
prime p, of a positive integer (1) and of j = j(p) positive integers (2),
subject to the restriction that the resulting functions j, g of p satisfy (22) for
some m = const., d = const.

Clearly, every degree-function determines an infinite product (19) which,
by uniform convergence, represents a regular function in the half-plane $\sigma > 1$.
However, a degree-function is not in general such as to make the product (19)
identical with the zeta-function of some algebraic number field $\mathfrak{K}$. In fact, the
characterization of these “algebraic” degree-functions presents an arithmetical
existence problem, even the nature of which is quite obscure today.

All that is clear is that the arithmetical restrictions in question impose
function-theoretical limitations on the choice of the degree-function. For
instance, a degree-function cannot belong to an algebraic number field unless
the function represented by the product (19) in the half-plane $\sigma > 1$ admits
of an analytic continuation regular in the whole plane except for a simple pole
at s = 1. A further necessary condition is that the resulting meromorphic
function be such as to satisfy a Riemann-Hecke functional equation.
(Incidentally, it is not evident in itself that this second necessary condition is
independent of the first.) However, the existence of a $\mathfrak{K}$ remains undecided
even if a functional equation is satisfied.

What actually happens is that, if the degree-function is chosen at random, 
then the corresponding Eulerian product (19), instead of defining a function 
meromorphic (or, at least, algebroid) in the whole s-plane, will possess a natural 
boundary ; in fact, this will be the case for “almost all” choices of the degree- 
function. The “almost all” refers to a Lebesgue measure naturally associated 




with the space of all degree-functions belonging to any fixed value of m in (22). 

In fact, consider an arbitrary disjunction of the sequence of all rational
primes p into two complementary subsequences, and let a p be called a q or an r
according as p is in the first or in the second subsequence. Thus

$\prod_q (1 - q^{-s})^{-1} \prod_r (1 - r^{-s})^{-1} = \prod_p (1 - p^{-s})^{-1}$

is Riemann’s $\zeta(s)$ and both products

(23)    $\eta(s) = \prod_q (1 - q^{-s})^{-1}$,    $\zeta(s)/\eta(s) = \prod_r (1 - r^{-s})^{-1}$

represent non-vanishing regular functions in the half-plane $\sigma > 1$. It was
shown in [6], p. 23, that the underlying disjunction can be so chosen as to
make the line $\sigma = 1$ a natural boundary for the first, and therefore for the
second, of the functions (23).

However, the “generic” case, characterizing “almost all” disjunctions,
is that the first, and therefore the second, of the corresponding functions (23)
remains regular on the line $\sigma = 1$ except for an algebraic branch-point of
order $\tfrac{1}{2}$ (instead of a pole of order 1) at s = 1, and admits across the line
$\sigma = 1$ an analytic continuation which, except for possible algebraic
singularities (the non-existence of which depends on the truth of Riemann’s
hypothesis for the $\zeta(s)$ of the rational field), exists and is regular in the
half-plane $\sigma > \tfrac{1}{2}$ but has the line $\sigma = \tfrac{1}{2}$ as natural boundary; cf. [6], p. 27.
In order to deduce from these facts the assertion italicized before (23), 

let, with reference to any given disjunction, a degree-function belonging to 
m= 2 in (22) be defined as follows: The value of the function (1) of p is 
chosen to be 2 or 1 according as p is a q or an r. This implies a unique 
assignment of the functions (2) also, if (22) is required to hold for m = 2 
(and for every p). It is clear from the identity 

q q q 
where o > 1, that the product (19) defined by the resulting degree-function 
can be written in the form 


qd 


and is therefore identical with 


£(s)n(2s), 


by (23) and by the identity preceding (23). 




This completes the proof, since $\zeta(s)$ is meromorphic in the whole plane
but $\eta(2s)$ has sometimes the line $\sigma = \tfrac{1}{2}$, and almost always the line $\sigma = \tfrac{1}{4}$,
as natural boundary.


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] R. Fricke, Lehrbuch der Algebra, vol. 3, Algebraische Zahlen, Braunschweig, 1928.
[2] G. H. Hardy, “Note on Ramanujan’s trigonometrical sum $c_q(n)$,” Proceedings of
the Cambridge Philosophical Society, vol. 20 (1921), pp. 263-271.
[3] A. E. Ingham, The distribution of prime numbers, Cambridge Tracts, No. 30 (1932).
[4] J. Karamata, “Weiterführung der N. Wienerschen Methode,” Mathematische
Zeitschrift, vol. 38 (1934), pp. 701-708.
[5] G. Pólya, “Ueber die Verteilung der quadratischen Reste und Nichtreste,” Göttinger
Nachrichten, 1918, pp. 21-29.
[6] A. Wintner, The theory of measure in arithmetical semi-groups, Baltimore, 1944.
[7] A. Wintner, “The densities of the ideal classes and the existence of unities in algebraic
number fields,” American Journal of Mathematics, vol. 67 (1945), pp. 235-238.


THE FUNDAMENTAL LEMMA IN DIRICHLET’S THEORY OF 
THE ARITHMETICAL PROGRESSIONS.* 


By AUREL WINTNER. 


1. By a completely multiplicative function $\chi = \chi(n)$ is meant any
representation of the ordinary multiplication on the semi-group of all positive
integers $n$, that is, any infinite sequence of real or complex numbers $\chi(1)$,
$\chi(2), \dots$ satisfying the identity $\chi(n)\chi(m) = \chi(nm)$ and the additional
restriction that $\chi(n) \neq 0$ holds for at least one $n$. Since the latter proviso is
equivalent to $\chi(1) \neq 0$ and also to $\chi(1) = 1$, it follows that the most general
$\chi(n)$ results by choosing its values, $\chi(p)$, for primes, $p$, in an arbitrary
manner and then placing

(1) $\chi(n) = \prod_{p \mid n} \chi^k(p)$ if $n = \prod_{p \mid n} p^k$,

where $\chi^k(p)$ denotes the $k$-th power of $\chi(p)$, and both products have the
value 1 when they are vacuous, that is, when $n = 1$.
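The construction (1) can be illustrated by a minimal sketch, in which the names `factorize` and `make_chi` and the particular choice of the prime values are assumptions made only for the illustration. The sketch builds $\chi(n)$ from arbitrarily prescribed values $\chi(p)$ by means of the prime factorization of $n$.

```python
def factorize(n):
    """Return the prime factorization of n >= 1 as a dict {p: k}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def make_chi(chi_on_primes):
    """Build chi(n) from its values on the primes, following formula (1)."""
    def chi(n):
        value = 1  # vacuous product, so that chi(1) = 1
        for p, k in factorize(n).items():
            value *= chi_on_primes(p) ** k
        return value
    return chi


# Hypothetical choice of the data: chi(p) = -1 for every prime p.
chi_minus_one = make_chi(lambda p: -1)
# Another hypothetical choice, with values in {-1, 0, 1}.
chi_mod_three = make_chi(lambda p: p % 3 - 1)
```

Any function obtained in this way satisfies $\chi(n)\chi(m) = \chi(nm)$, since the exponents of the primes add when $n$ and $m$ are multiplied.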
Clearly, (1) is formally equivalent to the truth of Euler’s relation

(2) $\sum_{n=1}^{\infty} \chi(n)/n^s = \prod_p (1 - \chi(p)/p^s)^{-1}$

(as an identity in $s$). The italicized reservation is necessary, since the series
(2) and/or the product (2) can diverge for every $s$. It may be mentioned that
this will not be the case if and only if the values $\chi(p)$ determining the
representation $\chi(n)$ are so chosen that there exists a sufficiently large $C$ satisfying

(3) $|\chi(p)| < p^C$

for every $p$. Then (2) holds, by absolute convergence, in the half-plane
$\sigma > C + 1$, where $s = \sigma + it$.

A classical instance of (1) is supplied by any of the $\phi(m)$ residue
characters (mod $m$). Any of these particular functions $\chi$ of $n$ has the period $m$
and, except in case of a principal character, the mean-value 0 over a period.
This implies that $L(s)$, the Dirichlet series on the left of (2), is convergent
in the half-plane $\sigma > 0$. In addition, all non-vanishing values of $\chi$ become of
absolute value 1, that is,


* Received December 10, 1945. 




(4) $|\chi(p)|^2 = |\chi(p)|$

holds for every $p$.

If $\chi(n)$ is a complex character (mod $m$), the non-vanishing of $L(1)$ is
trivial; cf., e.g., [1], p. 171. On the other hand, if $\chi(n)$ is a (non-principal)
real character (which, by (4), means that $\chi(p)$ is capable of the three values

(5) $\chi(p) = -1, 0, 1$

only, where $-1$ actually occurs), then the non-vanishing of $L(1)$ is the
central fact in Dirichlet’s proof for the existence of an infinity of primes in
each of the $\phi(m)$ arithmetical progressions $h, h + m, h + 2m, \dots$, where
$(h, m) = 1$. Although it is undoubtedly impossible to conclude from Dirichlet’s
theorem on the progressions that none of the real non-principal values $L(1)$
vanishes (which alone would prove the necessity of an analytical proof), the
theorem on the progressions has never been reached by an approach distinct
from Dirichlet’s route. In fact, all the known variants differ only in variations
of the proof of the fundamental lemma, $L(1) \neq 0$, of Dirichlet’s theory.


2. A short account of these variants may be read on pp. 169-172 in [1].
The literature quoted by Hecke can be completed by referring to a more recent
device of Ingham [2], further developed by Rankin [4]. Ingham’s method,
adjusted to complex-valued representations $\chi(n)$ satisfying (4), consists in
taking suitable combinations of complex-conjugates and then applying the
function-theoretical argument used by Landau [3] in the case (5). This
elaborate approach has far-reaching advantages (for instance, it leads to a
variant of the Hadamard-de la Vallée Poussin proof of $\zeta(1 + it) \neq 0$ for
$-\infty < t < \infty$). But in the case of real-valued representations $\chi(n)$, it does
not seem to apply to cases distinct from (5), since it depends on (4), that is,
on the assumption that all non-vanishing values of the (real or complex)
function $\chi(n)$ be of absolute value 1.

However, it is suggested by a recent consideration ([5], p. 68, where
$f'(n)$ corresponds to the present $\chi(n)$, and $f(n)$ to the coefficient of $1/n^s$ in
the Dirichlet series of $\zeta(s)L(s)$; cf. (18) below), that the true condition
belonging to a real representation $\chi(n)$ has nothing to do with the sharp
arithmetical restriction (5), but merely with the assumption

(6) $-1 \le \chi(p) < \infty$

(for every $p$), which is qualitative in nature. Since this unilateral restriction
does not imply the convergence of the Dirichlet series $L(s)$ for some $s$, an
additional convergence condition must then of course be required. But this
additional restriction disappears if (6) is particularized to the generalization

(7) $-1 \le \chi(p) \le 1$

of the classical case (5). And the case (7) is of fundamental arithmetical
interest, since, as seen from (1), a real-valued representation $\chi(n)$ of the
multiplication is a bounded representation if and only if its data $\chi(p)$ satisfy
(7) for every $p$.

It turns out that the extension of Dirichlet’s theorem $L(1) \neq 0$ from
the classical case (5) to the case (7), and even to the more general case of
(6), is actually possible. The proof will depend on an observation the rough
content of which is to the effect that the rôles of the $L$-series (2) and of the
Riemann zeta-function

(8) $\zeta(s) = \prod_p (1 - 1/p^s)^{-1}$ ($\sigma > 1$)

can be interchanged in the Landau-Ingham treatment of the case (4).
Correspondingly, what is essential in the analogous situation considered in [5],
p. 68, is not the sharp restriction (5), but merely the assumption that the
coefficient $1 + a_p = 1 + \chi(p)$ on the right of the formal identity

(9) $(1 + a_p/p^s)(1 - 1/p^s)^{-1} = 1 + \sum_{k=1}^{\infty} (1 + a_p)/p^{ks}$,

where $a_p = \chi(p)$ and $\sigma > 0$, be non-negative for every $p$. But this is precisely
the assumption (6).
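The identity (9) is the geometric series for $(1 - u)^{-1}$, where $u = p^{-s}$, multiplied by $1 + a u$. A minimal numerical sketch of it follows; the function names `lhs` and `rhs` and the truncation at 200 terms are assumptions of the sketch.

```python
def lhs(a, p, s):
    """Left side of (9): (1 + a/p^s) * (1 - 1/p^s)^(-1)."""
    return (1 + a / p**s) / (1 - 1 / p**s)


def rhs(a, p, s, terms=200):
    """Right side of (9), truncated: 1 + sum over k >= 1 of (1 + a)/p^(k s)."""
    return 1 + sum((1 + a) / p**(k * s) for k in range(1, terms + 1))


# The coefficient 1 + a is non-negative exactly when a >= -1, as in (6).
for a in (-1.0, -0.5, 0.0, 3.0):
    print(a, lhs(a, 2, 1.0), rhs(a, 2, 1.0))
```

In the extreme case $a = -1$ both sides reduce to 1, which corresponds to the vanishing of all coefficients beyond the first.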


3. It will be convenient to defer the treatment of the general case (6),
which is complicated by the necessity of an additional convergence assumption
but is otherwise not different from the treatment of the particular case (7).
In the latter case, the theorem is as follows:

(*) Let $\chi(n)$ be a real, bounded representation (1) of the ordinary
multiplication on the semi-group of the positive integers. Suppose that the
regular function $L(s)$ defined by the Dirichlet series

(10) $L(s) = \sum_{n=1}^{\infty} \chi(n)/n^s$

on the half-line $s > 1$ admits across the point $s = 1$ an analytic continuation
which remains regular on the segment

(11) $\tfrac{1}{2} \le s \le 1$.

Then

(12) $L(1) > 0$

or, what amounts to the same thing, $L(1) \neq 0$.




The regularity of $L(s)$ on (11) is essential, since $L(1) = 0$ becomes
possible as soon as (11) is replaced by

(13) $\tfrac{1}{2} < s \le 1$.

The insufficiency of (13) is proved by the standard case of Liouville’s
function, $\chi(n) = \lambda(n)$. In this case, (1) is satisfied, as is the boundedness
condition (7), since $\lambda(p) = -1$ for every $p$ (incidentally, this is just the
extreme case admitted by (6) alone). Hence, if $s$ is replaced by $2s$ in (8),
it is seen from (2) that the function (10), where $s > 1$, now becomes

$L(s) = \prod_p (1 + 1/p^s)^{-1} = \zeta(2s)/\zeta(s)$

and is therefore regular on the segment (13). Nevertheless, $L(1) = 0$, since
$\zeta(s)$ in $\zeta(2s)/\zeta(s)$ has a pole at $s = 1$ but $\zeta(2) \neq 0$.
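The relation $\sum \lambda(n)/n^s = \zeta(2s)/\zeta(s)$ of Liouville’s case can be tested numerically at a point of absolute convergence. The following sketch is only an illustration: the point $s = 2$, the cut-off of the partial sum and the function names are assumptions, and the closed values $\zeta(2) = \pi^2/6$, $\zeta(4) = \pi^4/90$ are the classical ones.

```python
import math


def liouville(n):
    """lambda(n) = (-1)^k, where k is the number of prime factors of n
    counted with multiplicity."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1


def L_partial(s, N):
    """Partial sum of the Dirichlet series (10) with chi = lambda."""
    return sum(liouville(n) / n**s for n in range(1, N + 1))


# zeta(4)/zeta(2) = (pi^4/90)/(pi^2/6) = pi^2/15
target = math.pi**2 / 15
approx = L_partial(2.0, 5000)
print(approx, target)
```

The neglected tail is bounded by the sum of $1/n^2$ over $n > 5000$, hence by $1/5000$, so that the two printed values must agree to about three decimals.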

In the sequel, only (11) will be considered. 

First, the boundedness of $\chi(n)$, which is equivalent to (7), implies that
both the Dirichlet series (10) and its Eulerian factorization (2) are absolutely
convergent in the half-plane $\sigma > 1$. Furthermore, it is clear from (7), and
even from the more general assumption (6), that every factor of the product
(2) is positive when $s > 1$ (simply because $1/p^s < 1$). Since the function
$L(s)$, being regular for $s \ge \tfrac{1}{2}$, is continuous at $s = 1$, it follows that the
assertion (12) is equivalent to the negation of

(14) $L(1) = 0$.
4. It is also clear from (7) and (1) that the Dirichlet series

(15) $J(s) = \sum_{n=1}^{\infty} \chi^2(n)/n^s$,

where $\chi^2(n)$ denotes the square of $\chi(n)$, is (absolutely) convergent on the
half-line $s > 1$ and admits there the Eulerian factorization

(16) $J(s) = \prod_p (1 - \chi^2(p)/p^s)^{-1}$.

If $s$ in (16) is replaced by $2s$, it is seen from the factorization (2) of $L(s)$
that

$L(s)/J(2s) = \prod_p (1 + \chi(p)/p^s)$

when $s > 1$. Hence, if $B(n)$ is defined by

(17) $\zeta(s)L(s)/J(2s) = \sum_{n=1}^{\infty} B(n)/n^s$,

then, from (8),

(18) $\prod_p (1 + \chi(p)/p^s)(1 - 1/p^s)^{-1} = \sum_{n=1}^{\infty} B(n)/n^s$.

Finally, if the case $a_p = \chi(p)$ of the identity (9) is combined with the
assumption (7), it is seen from (18) that

(19) $B(n) \ge 0$

holds for every $n$ and, of course,

(20) $B(1) = 1$.
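By (9) and (18), the coefficient $B(n)$ is the product of the numbers $1 + \chi(p)$ over the distinct primes $p$ dividing $n$, which makes (19) and (20) evident when $\chi(p) \ge -1$. The sketch below compares this product with the Dirichlet convolution of the coefficients of $\prod_p (1 + \chi(p)/p^s)$ with those of $\zeta(s)$. The prime values `chi_p` are a hypothetical choice satisfying (7), and all function names are assumptions of the sketch.

```python
from fractions import Fraction


def prime_divisors(n):
    """Distinct primes dividing n, in increasing order."""
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def chi_p(p):
    # Hypothetical data with -1 <= chi(p) <= 1, as required by (7).
    return Fraction(-1) if p % 4 == 3 else Fraction(1, 2)


def B_product(n):
    """B(n) as the product of 1 + chi(p) over the primes p dividing n."""
    value = Fraction(1)
    for p in prime_divisors(n):
        value *= 1 + chi_p(p)
    return value


def B_convolution(n):
    """B(n) as the sum of chi(d) over the squarefree divisors d of n."""
    total = Fraction(0)
    for d in range(1, n + 1):
        if n % d:
            continue
        primes = prime_divisors(d)
        radical = 1
        term = Fraction(1)
        for p in primes:
            radical *= p
            term *= chi_p(p)
        if radical == d:  # d is squarefree
            total += term
    return total
```

Exact rational arithmetic is used so that the two expressions can be compared without rounding.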


Suppose that (14) is true.

Since the function $L(s)$ is supposed to be regular on the segment (11),
and therefore on the half-line $s \ge \tfrac{1}{2}$, and since the zero of $L(s)$ at $s = 1$
absorbs the pole of $\zeta(s)$ in the product $\zeta(s)L(s)$, it follows that this product
is regular for $s \ge \tfrac{1}{2}$. On the other hand, since (16) represents a non-vanishing
regular function on the half-line $s > 1$, the function $1/J(2s)$ is regular for
$s > \tfrac{1}{2}$. Consequently, the function on the left of (17) is regular for $s > \tfrac{1}{2}$.
It follows therefore from (19) and from Landau’s extension [3] of the
Vivanti-Pringsheim theorem to Dirichlet series, that the Dirichlet series on the right
of (17) is convergent for $s > \tfrac{1}{2}$. And, for reasons of analyticity, the sum of
this Dirichlet series must be the function on the left of (17), if $s > \tfrac{1}{2}$.

It turns out that this contains a contradiction. In order to show this,
the alternative possibilities represented by

(21) $\sum_p \chi^2(p)/p < \infty$

and

(22) $\sum_p \chi^2(p)/p = \infty$

will have to be disposed of separately.


5. Consider first the possibility (21). In this case, it is clear that the
sum of the reciprocal values of those primes $p$ which satisfy the inequality
$|\chi(p)| \ge \tfrac{1}{2}$ is a convergent series, since each of the corresponding terms of
(21) is minorized by the reciprocal value of $4p$. But it will now be shown that
(whether (21) or (22) be the case) the sum of the reciprocal values of those
primes $p$ which satisfy the complementary inequality, $|\chi(p)| < \tfrac{1}{2}$, must also
converge. Hence, (21) will lead to the contradiction $\sum 1/p < \infty$, where $p$ runs
through all primes.

First, since the Dirichlet series on the right of (17) is convergent for
$s > \tfrac{1}{2}$, and therefore for $s = 1$, it follows from (19) that this Dirichlet series
is absolutely convergent for $s = 1$. On the other hand, since (15) is absolutely
convergent for $s > 1$ and therefore for $s = 2$, the Dirichlet series of $J(2s)$
is absolutely convergent at $s = 1$. It follows therefore from (17) that $\zeta(s)L(s)$
is the product of two ordinary Dirichlet series both of which are absolutely
convergent at $s = 1$. Hence, the same is true of the product series. In
particular, if $c_n$ denotes the coefficient of $1/n^s$ in the Dirichlet series of $\zeta(s)L(s)$,
then $\sum |c_p|/p < \infty$, where $p$ runs through all primes. But it is clear from
(8) and from the factorization (2) of $L(s)$, where $s > 1$, that $c_p = 1 + \chi(p)$.
Accordingly,

$\sum_p |1 + \chi(p)|/p < \infty$.

However, this implies that the sum of the reciprocal values of those primes $p$
which satisfy the inequality $|\chi(p)| < \tfrac{1}{2}$ is a convergent series. In fact,
$|1 + \chi(p)|$ is then minorized by $\tfrac{1}{2}$.
This completes the proof for the impossibility of (21). The refutation of
the remaining possibility, (22), is quite different. It proceeds as follows:
Since the Dirichlet series on the right of (17) converges to the function
on the left of (17) if $s > \tfrac{1}{2}$, it is clear from (19) and (20) that

$\zeta(s)L(s) \ge J(2s)$

if $s > \tfrac{1}{2}$. Since (16) is valid for $s > 1$, it follows that

$\zeta(s)L(s) \ge 1/\prod_p (1 - \chi^2(p)/p^{2s})$

if $s > \tfrac{1}{2}$. Hence it is seen from $\chi^2(p) \ge 0$ that (22) implies the relation

$\liminf_{s \to 1/2 + 0} \zeta(s)L(s) = \infty$.

But this relation contradicts the assumption (which has not been used thus
far) that the function $L(s)$ is regular not only on the segment (13) but on
the closed segment (11) as well.
This completes the proof of (*).


Remark. The assertion of (*) remains valid if the function $L(s)$,
instead of being regular on the segment (11), is regular just on the segment
(13), but does not tend to $\infty$ as $s \to \tfrac{1}{2} + 0$ (which does not even require that
$L(s) <$ const. as $s \to \tfrac{1}{2} + 0$).

In fact, all that is needed is that the assertion of the last formula line be
contradictory.


6. It is now easy to see that the assumption (7) of (*) can be generalized
to (6), provided that a convergence condition is satisfied.

(**) Let a representation (1) of the ordinary multiplication on the
semi-group of all positive integers $n$ be such as to satisfy the unilateral restriction
$\chi(p) \ge -1$ for every prime $p$. Suppose that the Dirichlet series

$J(s) = \sum_{n=1}^{\infty} \chi^2(n)/n^s$,

where $\chi^2(n)$ denotes the square of $\chi(n)$, is convergent for $s > 1$. Then the
same is true of the Dirichlet series

$L(s) = \sum_{n=1}^{\infty} \chi(n)/n^s$.

Suppose that the function $L(s)$ admits across the point $s = 1$ an analytic
continuation which remains regular for $\tfrac{1}{2} \le s \le 1$. Then $L(1) \neq 0$ (or,
equivalently, $L(1) > 0$).

In fact, the convergence of the Dirichlet series $J(s)$ for $s > 1$ implies,
by Schwarz’s inequality, the absolute convergence of the Dirichlet series $L(s)$
for $s > 1$. Hence, the Eulerian factorizations (2), (16) hold, by absolute
convergence, for $s > 1$. Since $\chi(p) \ge -1$, none of the factors occurring in
these factorizations of $L(s)$, $J(s)$ becomes meaningless (i.e., of the form $0^{-1}$).
Correspondingly, the proof of (**) is exactly the same as was that of (*).
In fact, the various steps in the proof of (*) were purposely referring to (6)
rather than to (7).


Remark. The whole proof is such as to leave little doubt that the value
$-1$ of the absolute constant occurring on the right of the assumption
$\chi(p) \ge -1$ of (**) is as sharp as possible. That some lower limitation
$\chi(p) \ge -\theta$ is necessary, is shown by the example $\chi(n)$ defined by $\chi(2) = -2$,
$\chi(3) = \chi(5) = \dots = 0$. In fact, (14) is true in this case, since the
factorization (2) shows that $L(s)$ becomes identical with the function $1 - 2^{1-s}$.
Thus $\theta = 2$ is not an admissible value of $\theta$ in $\chi(p) \ge -\theta$. But all that
follows in this trivial manner does not disprove the (unlikely) possibility $\theta > 1$.
In this regard, constructions of the type considered in [6] seem to be relevant.


THE JOHNS HOPKINS UNIVERSITY.


REFERENCES. 


[1] E. Hecke, Vorlesungen über die Theorie der algebraischen Zahlen, Leipzig, 1923.

[2] A. E. Ingham, “Note on Riemann’s zeta-function and Dirichlet’s L-functions,”
Journal of the London Mathematical Society, vol. 5 (1930), pp. 107-112.

[3] E. Landau, “Über einen Satz von Tschebyschef,” Mathematische Annalen, vol. 61
(1905), pp. 527-550.

[4] R. A. Rankin, “Contributions to the theory of Ramanujan’s tau-function and
similar arithmetical functions, I,” Proceedings of the Cambridge Philosophical
Society, vol. 35 (1939), pp. 351-356.

[5] A. Wintner, Eratosthenian Averages, Baltimore, 1943.

[6] A. Wintner, “The singularities in a family of zeta-functions,” Duke Mathematical
Journal, vol. 11 (1944), pp. 287-291.




ASYMPTOTIC INTEGRATION CONSTANTS IN THE 
SINGULARITY OF BRIOT-BOUQUET.* 


By AUREL WINTNER. 


1. The following considerations deal with the problem of the real
Briot-Bouquet equation

(1) $xy' = py + qx + \phi(x, y)$,

where the prime denotes differentiation with respect to $x$, the coefficients $p$, $q$
are constants and $\phi(x, y)$ represents terms which, in a sense to be specified,
are of a higher order than the linear form $py + qx$, as $(x, y) \to (0, 0)$. The
connection with Poincaré’s problem

(2) $\dot{x} = ax + by + f(x, y)$, $\dot{y} = cx + dy + g(x, y)$,

where the dots denote differentiations with respect to a time variable $t$ ($\to \infty$),
the constant matrix

(3) $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$

has a non-vanishing determinant and $f(x, y)$, $g(x, y)$ represent the “small”
non-linear perturbations, is as follows:

The geometrical problem concerning the behavior of the solution paths of 
(2) near (x,y) = (0,0) naturally splits into two main types, according as 
the characteristic numbers of (3) are or are not real and of the same sign. 
The second type comprises three subcases, since the characteristic numbers 
can then be real and of opposite signs (saddle), complex but not purely 
imaginary (vortex) or purely imaginary (center or vortex). The problem to 
be treated does not arise in any of these subcases of the second type, treated, 
under very general assumptions, in the second part of Perron’s paper [4] 
(correspondingly, all of the following references to [4] will quote its first 
part only). In fact, the problem to be considered is one concerning a node, 
represented by the first type, that is, by the cases in which the characteristic 
numbers of (3) are real and of the same sign. This type, too, comprises three 
subcases, since the characteristic numbers can now be distinct or not and, if 
they are not distinct, the elementary divisors can be distinct or not (the corre- 


sponding normal forms of (3) are 


* Received December 14, 1945. 




ED: 


where 0 <7 1). And the-connection referred to before (2) consists in the 
formal fact that (2) can be reduced, in all three nodal cases (4), to (1),
where $p > 0$ (and, without substantial loss of generality,

(4 bis) $p = 1$, $q = 0$; $p > 1$, $q = 0$; $p = 1$, $q = 1$

respectively); cf., e.g., [4], pp. 140-146.

The six figures, illustrating the three possibilities in the three subcases 
(saddle, vortex, center) of the second type and the three distinct kinds of 
nodes of the first type, may be found in [2], pp. 101-102. The three nodal 
figures are given also in [4], p. 123. For references to the classical literature 
of the subject, cf. [3], pp. 215-216, 219-220, 227-228. The more recent papers 
of Frommer [1] and Weyl [5] do not deal with the problem to be considered. 


2. The problem in question concerns the determination of the paths in 
a nodal sheaf by the asymptotic slope (or its equivalent) as an arbitrary inte- 
gration constant. In all three cases (4 bis), this problem will be treated as an 
application of a general theorem on asymptotic equilibria (cf. [6]), including 
a continuity theorem of Siegel (ibid., p. 131) concerning such equilibria. 
When formulated for the relevant case of a single differential equation, these 
facts can be stated as follows: 
In a half-plane

(5) $u_0 < u < \infty$, $-\infty < v < \infty$,

let $f(u, v)$ be a real-valued, continuous function satisfying

(6) $\int^{\infty} |f(u, 0)| \, du < \infty$

and having the property that, if $u$ is large enough and $v'$ and $v''$ are arbitrary,
the Lipschitz condition

(7) $|f(u, v') - f(u, v'')| \le \lambda(u) |v' - v''|$

is satisfied by a (non-negative and, for instance, continuous) function $\lambda$ for
which

(8) $\int^{\infty} \lambda(u) \, du < \infty$.

Then, if $v^0$ is given arbitrarily and if $u^0$ ($> u_0$) is any value greater than a
lower bound depending on the given value of $v^0$, the differential equation

(9) $dv/du = f(u, v)$

and the initial condition $v(u^0) = v^0$ determine a solution $v = v(u)$ which
exists on the whole half-line $u^0 \le u < \infty$. Furthermore, there exists a finite
limit, $v(\infty)$, for this solution. Conversely, if $c$ is any real number, there
exists a solution $v = v(u)$ corresponding to which the limit $v(\infty)$ attains
the given value $c$, and this solution is uniquely determined by $c$. In addition,
the correspondence between $c$ and the solution $v = v(u)$ determined by $c$
is continuous.

[A corresponding theorem holds for the case in which (9) is replaced by
a system, that is, in which $v$, $f$ are vectors. Theorem (ii) in [6] extends this
to the case in which the asymptotic equilibria, $v(\infty) = c$, are replaced by
small oscillations about such equilibria. This extension depended on the
method of the variation of constants in the particular case of small oscillations.
Correspondingly, a further extension results if, for the case of inhomogeneous
linear systems, the method of the variation of constants is formulated in its
general form, as follows: Let $A = A(t)$ be a matrix of $n$ times $n$ continuous
functions on a $t$-interval and let $a(t)$ be a vector of $n$ continuous functions on
the same interval. Then, if $X = X(t)$ is any fixed fundamental matrix of
the homogeneous linear system $x' = A(t)x$ (where $x = x(t)$ and
$x' = dx/dt$), that is, if $X(t)$ is a matrix the $n$ columns of which are $n$ linearly
independent solution vectors $x(t)$ of $x' = A(t)x$, then the general solution
of the inhomogeneous linear system $y' = A(t)y + a(t)$ is the vector

$y(t) = X(t)\bigl(\alpha + \int_{t_0}^{t} X^{-1}(t) a(t) \, dt\bigr)$,

where $\alpha$ is an arbitrary constant vector and $t_0$ is any fixed point of the $t$-interval
(it is understood that the reciprocal matrix, $X^{-1}(t)$, exists for all $t$, since
$\det X(t) \neq 0$ for all $t$). The verification requires nothing but a differentiation,
since $X'(t) = A(t)X(t)$ is an identity in virtue of $x'(t) = A(t)x(t)$.]
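The formula of the variation of constants can be tested numerically in a simple case. The following sketch is an illustration under stated assumptions: the constant matrix $A$ with rows $(0, 1)$ and $(-1, 0)$, its fundamental matrix of rotations, the forcing vector $a(t) = (0, t)$ and all function names are chosen only for the example. The integral is evaluated by Simpson’s rule, and the residual $y' - Ay - a$ by a central difference.

```python
import math


def X(t):
    # Fundamental matrix of x' = A x for A = [[0, 1], [-1, 0]].
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]


def X_inv(t):
    # X(t) is a rotation, so its inverse is its transpose.
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


def a(t):
    # Hypothetical inhomogeneous term.
    return [0.0, t]


def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]


def integral(t0, t, steps=2000):
    # Composite Simpson rule for the vector integral of X^{-1}(tau) a(tau).
    h = (t - t0) / steps
    total = [0.0, 0.0]
    for i in range(steps + 1):
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        g = matvec(X_inv(t0 + i * h), a(t0 + i * h))
        total[0] += w * g[0]
        total[1] += w * g[1]
    return [total[0] * h / 3, total[1] * h / 3]


def y(t, alpha, t0=0.0):
    # y(t) = X(t) (alpha + integral from t0 to t of X^{-1} a).
    I = integral(t0, t)
    return matvec(X(t), [alpha[0] + I[0], alpha[1] + I[1]])


def residual(t, alpha, eps=1e-4):
    # y'(t) - A y(t) - a(t), with y' from a central difference.
    yp, ym, yt = y(t + eps, alpha), y(t - eps, alpha), y(t, alpha)
    dy = [(yp[i] - ym[i]) / (2 * eps) for i in range(2)]
    Ay = [yt[1], -yt[0]]
    at = a(t)
    return [dy[i] - Ay[i] - at[i] for i in range(2)]
```

For any choice of the constant vector, the residual should vanish up to the discretization error, and $y(t_0)$ should reduce to $X(t_0)\alpha$.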


3. The prototype of (1) is the differential equation in which the higher
terms, represented by $\phi(x, y)$, are missing; so that

(1 bis) $xy' = y$; $xy' = py$ ($p > 1$); $xy' = y + x$

in the respective three cases (4 bis).
In the first case, $xy' = y$, the general solution is $y(x) = cx$, where the
constant $c$ is arbitrary. For the corresponding perturbed equation (1), the
following theorem will be proved:

(i) On a rectangle

(10) $0 < x < a$, $-b < y < b$,

let $\phi(x, y)$ be a real-valued, continuous function, subject to the restriction

(11) $\int_{+0} x^{-2} |\phi(x, 0)| \, dx < \infty$

and, for sufficiently small $x > 0$, to the Lipschitz condition

(12) $|\phi(x, y_1) - \phi(x, y_2)| \le \mu(x) |y_1 - y_2|$,

where $\mu(x)$ is defined (and, for instance, continuous) for small $x > 0$ and
satisfies

(13) $\int_{+0} x^{-1} \mu(x) \, dx < \infty$.

Then, if $a$ and $b$ in (10) are chosen small enough, the behavior of the solutions of

(14) $xy' = y + \phi(x, y)$

can be described as follows: Every solution path $y = y(x)$ of (14), issuing
from an arbitrary point $(x_0, y_0)$ of the rectangle (10), exists on the whole
interval $0 < x \le x_0$ and tends to the origin, $(0, 0)$, in such a way that there
exists a constant $c = c(x_0, y_0)$ satisfying

(15) $y(x) \sim cx$ as $x \to +0$

(which should mean $y(x) = o(x)$ if $c = 0$). Conversely, if the (real) value
of the integration constant $c$ is assigned arbitrarily, there belongs to it a
unique solution path $y = y(x)$ satisfying (15).


It is instructive to contrast this theorem with the corresponding result of
Perron (Satz 4 in [4], p. 132). He assumes that $\phi(x, y)$ is continuous on the
closure of the open rectangle (10), which of course is a serious restriction of
the character of the singular point, $(0, 0)$. On the other hand, instead of
assuming a Lipschitz condition (12) and the average restrictions (13) and
(11), Perron requires the existence of a constant $\theta > 0$ satisfying

(16) $\phi(x, y) = o(r^{1+\theta})$ as $r = (x^2 + y^2)^{1/2} \to 0$.

Since (16) is a condition at the origin $(0, 0)$, it does not imply anything
like a Lipschitz condition. Correspondingly, it cannot ensure that (14) has
just one solution path through a point of the open rectangle (10).
Accordingly, the assertion of Perron’s theorem is that (14) has at least one (rather
than, as in (i), exactly one) solution path satisfying (15), if $c$ is given
arbitrarily. But the explicit assumption of a fixed $\theta > 0$ in (16), no matter how
drastic, seems to be essential in the proof of Perron’s theorem. On the other
hand, the integral assumptions of (i) prove to be the best possible restrictions
of their kind. This can be seen as follows:

Suppose that $\phi(x, y)$ in (14) is independent of $y$, that is, let

(17) $xy' = y + \phi(x)$,

where $\phi(x)$ is continuous for small $x > 0$ (as will be seen in a moment, it is,
this time, beside the main point that, so far as (i) is concerned, $x > 0$ could
be replaced by $x \ge 0$). Since (12) and (13) are now satisfied (by $\mu \equiv 0$),
the assumptions of (i) reduce to (11) alone, that is, to the absolute convergence
of the improper integral

(18) $\int_{+0} x^{-2} \phi(x) \, dx$.

In contrast, Perron’s assumption (16) requires that (18) be convergent for
the drastic reason that the function integrated in (18) becomes $o(x^{-\kappa})$ for
some $\kappa < 1$. On the other hand, if $\phi(x) > 0$, the sufficient condition supplied
by (i) turns out to be necessary for the truth of the assertions of (i). In fact,
the absolute convergence of (18) is equivalent to the convergence of (18),
if $\phi(x) > 0$. But the convergence of (18) is now necessary and sufficient for
the truth of the assertions of (i), whether $\phi(x) > 0$ be satisfied or not. This
assertion, which proves that (i) is of a final nature, can be verified as follows:

Since the differential equation (17) is linear, it can be solved by a
quadrature. This gives

(18 bis) $y(x) = x\bigl(c + \int_{x_0}^{x} x^{-2} \phi(x) \, dx\bigr)$,

where $x_0$ is fixed and $c$ is arbitrary. But the representation (18 bis) of the
general solution of (17) makes it clear that the convergence of (18) is
necessary and sufficient for the truth of the assertion (15), where $c$ is unspecified
(in fact, the parenthetical remark following (15) requires the inclusion of
$c = 0$).
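The quadrature (18 bis) can be followed numerically. In the sketch below, the perturbation $\phi(x) = x^{3/2}$, the base point $x_0 = 1$ and the function names are assumptions made for the illustration; for this $\phi$ the integral (18) converges, and (18 bis) has the closed form $y(x) = x(c + 2\sqrt{x} - 2\sqrt{x_0})$.

```python
import math


def phi(x):
    # Hypothetical perturbation for which the integral (18) converges.
    return x ** 1.5


def y(x, c, x0=1.0, steps=2000):
    # (18 bis): y(x) = x * (c + integral from x0 to x of t^(-2) phi(t) dt),
    # with the integral evaluated by the composite Simpson rule.
    h = (x - x0) / steps
    total = 0.0
    for i in range(steps + 1):
        t = x0 + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * phi(t) / t**2
    return x * (c + total * h / 3)


def closed_form(x, c, x0=1.0):
    return x * (c + 2 * math.sqrt(x) - 2 * math.sqrt(x0))


for x in (0.5, 0.1, 0.01):
    print(x, y(x, 0.5) / x, closed_form(x, 0.5) / x)
```

The printed ratios $y(x)/x$ approach $c$ plus the value of the convergent integral taken from $x_0$ down to $+0$, which is the asymptotic integration constant of (15) in this example.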

Incidentally, all of this implies that, as verified by Perron ([4], p. 131)
by considering a specific $\phi(x)$ in (17), his assumption (16), where $1 + \theta > 1$,
cannot be relaxed to $\phi = o(r)$, where $r \to 0$.


4. In the third of the cases (4 bis), the prototype ($\phi \equiv 0$) of (1) is
$xy' = y + x$. If $x > 0$, the general solution of this linear differential equation
is seen to be $y(x) = x \log x + cx$, where $c$ is arbitrary. The appearance of
the logarithmic (or, on the standard exponential scale of dynamics, “secular”)
term agrees, of course, with the fact that the present case, being represented
by the third of the matrices (4), is the case of a multiple elementary divisor.

Corresponding to this form of the general solution of the linear prototype,
the non-linear situation now turns out to be as follows:

(ii) Suppose that $\phi(x, y)$ satisfies the assumptions of (i), except that
(11) is replaced by

(19) $\int_{+0} x^{-2} |\phi(x, x \log x)| \, dx < \infty$.

Then the assertions of (i) remain true if

(20) $xy' = y + x + \phi(x, y)$

and

(21) $y(x) = x \log x + cx + o(x)$ as $x \to +0$

are read instead of (14) and (15) respectively.
If (20) is chosen to be of the particular type

(22) $xy' = y + x + \phi(x)$,

the linear differential equation which now replaces (17), then the quadrature
which corresponds to the solution (18 bis) of (17) in the case (22) proves
that (ii) is of a final nature in the same sense as (i).

As for Perron’s result in this case of a multiple elementary
divisor (Satz 5 in [4], p. 133), his assumptions are, as before, the continuity
of $\phi(x, y)$ on the closure of the rectangle (10) and the existence of a positive
index $\theta$ satisfying (16). But his assertion is now very weak; in fact, not even
the first approximation, $y(x) \sim x \log x$, to (21), but merely the qualitative
corollary, $y(x)/x \to -\infty$, of this approximation is asserted. In contrast, (ii)
introduces $c$ as an asymptotic integration constant and establishes a one-to-one
correspondence between the field of the solution paths and the points of the line
$-\infty < c < \infty$.

5. The remaining case is the second in (4 bis). According to (1), the
corresponding linear prototype is $xy' = py$, where $p > 1$. If $x > 0$, the general
solution of this linear differential equation is $y(x) = cx^p$ (whether $p$ does or
does not satisfy the assumption $p > 1$, except that

(23) $y(x) = o(x)$ as $x \to +0$, if $p > 1$,

but not if $p = 1$). Correspondingly, the non-linear theorem now turns out
to be as follows:

(i*) For every fixed $p > 0$, the assertions of (i) remain true if (11)
is replaced by

(11*) $\int_{+0} x^{-p-1} |\phi(x, 0)| \, dx < \infty$,

and (14) by

(14*) $xy' = py + \phi(x, y)$,

finally (15) by

(15*) $y(x) = cx^p + o(x^p)$ as $x \to +0$.

This is a generalization of (i), since $p = 1$ (and, for that matter,
$0 < p < 1$) is allowed in (i*). If $p > 1$, then (23) is a corollary of the
assertion of the last formula line.

Perron’s corresponding result (Satz 3 in [4], p. 129) concerns the case
$p > 1$. Besides the continuity of $\phi(x, y)$ on the closure of (10), all that he
now assumes is

(16 bis) $\phi(x, y) = o(r)$ as $r = (x^2 + y^2)^{1/2} \to 0$

(that is, (16) in the limiting case $\theta = 0$, excluded in the preceding cases),
but his assertion is just (23), a relation containing no integration constant.


6. Since (i*) implies (i), it will suffice to prove (i*) and (ii).

In both proofs, it can be assumed that $b$ is $\infty$ in (10). In fact, if $\phi(x, y)$
is given as a continuous function on a rectangle (10), it can be assumed to be
extended to the corresponding strip

(24) $0 < x < a$, $-\infty < y < \infty$,

in such a way that the respective assumptions of (i*) and (ii) remain satisfied.
But, if the assertions of (i*) and (ii) are proved for the case in which (10)
is replaced by (24), the assertions of (i*) and (ii) themselves follow, since
these assertions only concern a sufficiently small rectangle (10).

By the method of the variation of constants, (i*) and (ii) will now be
reduced to the theorem quoted after (5). To this end, put

(25$_1$) $x = e^{-u}$, $y = v e^{-pu}$

in the differential equation, (14*), of (i*), and

(25$_2$) $x = e^{-u}$, $y = (v - u) e^{-u}$

in the differential equation, (20), of (ii). Both (25$_1$) and (25$_2$) transform
the strip (24) into a $(u, v)$-domain which can be assumed to be the (schlicht)
half-plane (5) of the theorem to be applied. Both differential equations then
appear in the form (9), where, as easily verified,

(26$_1$) $f(u, v) = -e^{pu} \phi(e^{-u}, v e^{-pu})$

in the case of (i*), and

(26$_2$) $f(u, v) = -e^{u} \phi(e^{-u}, (v - u) e^{-u})$

in the case of (ii).
On the other hand, it is seen from (25$_1$) and (25$_2$) that the assertions of
(i*) and (ii), those concerning (15*) and (21) respectively, become identical
with the corresponding assertions of the theorem quoted in 2, that is, with
those concerning $v(\infty) = c$. Consequently, all that remains to be ascertained
is that the assumptions (6), (7), (8) of 2 are satisfied in the respective cases.
As for (6), it is clear from (26$_1$) and (26$_2$), respectively, that
the corresponding assumptions, (11*) and (19), of (i*) and (ii) are identical
with (6). On the other hand, straightforward reductions show that the
common Lipschitz assumptions, (12) and (13), of (i*) and (ii) are precisely
the Lipschitz assumptions, (7) and (8), of 2, whether $f(u, v)$ be the function
(26$_1$) or the function (26$_2$).
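The reduction (25$_1$), (26$_1$) can be followed on an explicit example. In the sketch below, the perturbation $\phi(x, y) = x^2 y$, the value $p = 1$, the initial data and the Runge-Kutta step are all assumptions of the illustration. For this $\phi$, the transformed equation (9) is $dv/du = -e^{-2u} v$, with the solutions $v = C \exp(e^{-2u}/2)$, so that $v(\infty) = C$ exists; correspondingly, $y(x) = Cx \exp(x^2/2) \sim Cx$.

```python
import math


def f(u, v, p=1.0):
    # (26_1) with the hypothetical perturbation phi(x, y) = x**2 * y.
    x = math.exp(-u)
    y = v * math.exp(-p * u)
    return -math.exp(p * u) * (x**2 * y)


def integrate(u_start, v_start, u_end, steps=4000):
    # Classical fourth-order Runge-Kutta method for dv/du = f(u, v).
    h = (u_end - u_start) / steps
    u, v = u_start, v_start
    for _ in range(steps):
        k1 = f(u, v)
        k2 = f(u + h / 2, v + h * k1 / 2)
        k3 = f(u + h / 2, v + h * k2 / 2)
        k4 = f(u + h, v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        u += h
    return v


# Starting from v(0) = 1, the exact limit is C = exp(-1/2).
v_far = integrate(0.0, 1.0, 20.0)
print(v_far, math.exp(-0.5))
```

Here $\phi(x, 0) = 0$ and $\mu(x) = x^2$, so that the integral conditions (11) and (13) of (i) are satisfied by the example.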


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] M. Frommer, “Die Integralkurven einer gewöhnlichen Differentialgleichung erster
Ordnung in der Umgebung rationaler Unbestimmtheitsstellen,” Mathematische
Annalen, vol. 99 (1928), pp. 222-272.

[2] H. Liebmann, Lehrbuch der Differentialgleichungen, Leipzig, 1901.

[3] P. Painlevé, “Gewöhnliche Differentialgleichungen; Existenz der Lösungen,”
Encyklopädie der mathematischen Wissenschaften, IIA 4a (1900).

[4] O. Perron, “Ueber die Gestalt der Integralkurven einer Differentialgleichung
erster Ordnung in der Umgebung eines singulären Punktes,” Mathematische
Zeitschrift, vol. 15 (1922), pp. 121-146 and vol. 16 (1923), pp. 273-295.

[5] H. Weyl, “Concerning a classical problem in the theory of singular points of
ordinary differential equations,” Actas de la Academia Nacional de Ciencias
Exactas, Fisicas y Naturales de Lima, vol. 7 (1944), pp. 21-60.

[6] A. Wintner, “Asymptotic equilibria,” American Journal of Mathematics, vol. 68
(1946), pp. 125-132.



ON THE ASYMPTOTIC BEHAVIOR OF THE SOLUTIONS OF 
A NON-LINEAR DIFFERENTIAL EQUATION.* 


By PHILIP HARTMAN and AUREL WINTNER.


1. The following considerations imply a simplified and, at the same 
time, generalized approach to certain of Poincaré’s qualitative results on the 
singularity (x,y) = (0,0) of the real (analytic) differential equation 


(1) vy = ae 0), 


and to the corresponding results of Bendixson on the non-analytic case of this 
differential equation and on 


(2) any’ ar + (BA0; m=1,2,-- 


(cf. Liebmann’s report [2], pp. 507-512, where further references will be
found; for more recent literature, cf. p. 178 of Dulac’s monograph [1]).

The most general result in the direction in question is that of Perron
([3]; cf. [2], pp. 512-513), which concerns the differential equation


(3) $\phi(x) y' = f(x, y)$


in which ¢(a), instead of being z or +.(as in (1) or (2) respec- 
tively), is any function which is continuous and positive on an interval 
0< 2a and satisfies the conditions 


(4) 
and 
(5) 0 as t>-+0, 


whereas f(z, y) is a real-valued continuous function on a-rectangle 
and is subjected there to the upper and lower Lipschitz conditions - 


(1) O<c<  (y* <y**), 


* Received January 8, 1946. 




302 PHILIP HARTMAN AND AUREL WINTNER. 


and to the condition 
(8) f(0,0) =0. 


Perron distinguishes two cases, according as the sign of the absolute value
can or cannot be omitted in (7). In the first case, he proves the existence of
two positive numbers $a'$, $b'$ having the property that every solution path
$y = y(x)$ of (3) issuing from a point of the rectangle


tends to the point $(0, 0)$ as $x \to +0$. In the second case, he proves that there
exists exactly one such solution path.

Perron’s method of proof consists of an elaborate application of the process
of successive approximations. The nature of such an analytical approach makes
sufficiently clear the rôle of the upper Lipschitz limitation in (7), that is, of
the assumption of a constant C. Correspondingly, this condition is satisfied as
soon as f(x, y) is sufficiently smooth (for instance, such as to possess a
continuous partial derivative with respect to y). In contrast, the lower Lipschitz
limitation in (7), that represented by the existence of a constant c > 0, is a serious
qualitative restriction. In fact, it is clear from the continuity of f(x, y) that,
if the lower limitation of (7) is satisfied, then f(x, y) is a strictly monotone
function of y (for every fixed x), it being increasing in the first, and decreasing
in the second, of the cases which represent Perron’s alternative, quoted
above. But the converse is not true, since, even if f(x, y) is regular-analytic,
the monotony of f(x, y) with respect to y does not imply the existence of a
constant c > 0 satisfying (7) near the critical point, (0, 0).


2. Thus it appears unexpected that, for the truth of Perron’s alternative, 
both the upper and the lower Lipschitz limitations in (7) turn out to be 
entirely superfluous by virtue of that corollary of (7) which requires the strict 
monotony of f(x, y) with respect to y. What makes the methodical situation
particularly striking is the fact that the proof of the resulting extensions 
(which apply to new cases even when f(x,y) is restricted to be regular- 
analytic) can be obtained by a transparent argument of purely geometrical 
considerations. In fact, the proof will require only an extension of the quali- 
tative procedure developed in [4]. A recourse to the analytical process of 
successive approximations proves, therefore, to be a ballast in every respect. 

Incidentally, it is superfluous to assume the restriction (5) which, in view
of the assumption (4), involves a limitation of the asymptotic smoothness of
the coefficient of y′ in (3) (the assumption (4), where $\phi(x) > 0$, being
compatible even with

(5 bis) $\limsup \phi(x) = \infty$

as $x \to +0$). Accordingly, the first case of the alternative to be proved can be
formulated as follows:


(i) Let $\phi(x)$, where $0 < x \le a$, be a positive, continuous function
satisfying (4), and let f(x, y), where (x, y) is restricted to the rectangle (6),
be a continuous function which satisfies (8) and is increasing with y when x
is fixed. Then (6) contains a rectangle (9) (even one with b′ = b) having
the following property: Every solution path y = y(x) of (3) passing through
a point $(x_0, y_0)$ of (9) can be continued for all positive $x\ (< x_0)$, and

(10) $y(x) \to 0$ as $x \to +0$

holds for all continuations.


Such an oblique formulation (in terms of “continuations”) is necessary,
since, in contrast to the case of an upper Lipschitz limitation on f(x, y),
a solution path y = y(x) of (3) can now easily lead to branch points, that is,
to points $(x_0, y_0)$ through which there is more than one solution path (suffice
it to say that (u, v) = (0, 0) is not a point of uniqueness of the differential
equation $dv/du = v^{1/3}$, although $v^{1/3}$ is an increasing function of v).
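The failure of uniqueness mentioned in the parenthesis can be checked numerically. The following is only a sketch under an assumption: the right-hand side is taken to be the real cube root of v, an increasing function of v, chosen here for illustration. The names `rhs`, `v_zero`, `v_branch` and `residual` are hypothetical. Two distinct curves through (0, 0) are tested against the equation.

```python
import math

def rhs(v):
    """Assumed right-hand side: the real cube root of v (increasing in v)."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def v_zero(u):
    """The trivial solution v = 0."""
    return 0.0

def v_branch(u):
    """A second solution through (0, 0): v = (2u/3)**(3/2) for u >= 0."""
    return (2.0 * u / 3.0) ** 1.5

def residual(sol, u, h=1e-6):
    """Central-difference estimate of dv/du minus the right-hand side."""
    dv = (sol(u + h) - sol(u - h)) / (2.0 * h)
    return dv - rhs(sol(u))

# Both curves pass through the origin and satisfy the equation for u > 0.
checks = [0.1, 0.5, 1.0, 2.0]
max_res_zero = max(abs(residual(v_zero, u)) for u in checks)
max_res_branch = max(abs(residual(v_branch, u)) for u in checks)
```

Both residuals are small although the two curves differ for every u > 0, so the origin is a branch point in the sense used above.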

The second assertion to be proved is as follows: 


(ii) Suppose that $\phi(x)$ and f(x, y) satisfy the assumptions of (i),
except that f(x, y), instead of being increasing, is decreasing (in y) when x
is fixed. Then (3) has one and only one solution y = y(x) satisfying (10).


The fate of all the other solution paths issuing from points of (6) is
explained by the following refinement of (ii): If $\phi(x)$ and f(x, y) satisfy
the assumptions of (ii) and if y = y(x) is any solution path of (3) distinct
from the unique solution path supplied by (ii), then the solution path cannot
reach the y-axis within the rectangle (6) and must, therefore, reach one of the
levels $y = \pm b$ at a positive $x = x^0$, which is an integration constant.

Under the assumptions of (i) or (ii), the function f(x, y), hence f(0, y)
as well, is continuous and strictly monotone (in y) when $-b \le y \le b$. The
assumption (8) implies that f(0, y) is not zero if $y \neq 0$ and so, by continuity,
f(x, y) is not zero for sufficiently small x if $y \neq 0$. Furthermore, f(0, b) and
f(0, -b) are of opposite signs and, again by virtue of the continuity of
f(x, y), it is possible to choose a positive number $a'\ (\le a)$ so small that f(x, b),
f(x, -b) do not vanish when $0 \le x \le a'$ and are of opposite signs.




3. It turns out that these properties of f(x, y) are essentially all that is
needed for the truth of the assertions of (i) and (ii). In fact, even the
continuity of f(x, y) at any point of the line x = 0 can be replaced by considerably
less restrictive conditions, to the effect that, barring the immediate vicinity of
the origin, f(x, y) is bounded away from zero. The fundamental assumptions
can then be formulated as follows:


ASSUMPTION (*). Let $\phi(x)$, where $0 < x \le a$, be a positive, continuous
function satisfying (4). Let f(x, y) be defined and continuous on the (partly
open) rectangle

(11) $R:\quad 0 < x \le a, \quad -b \le y \le b$

and have the property that 1/f(x, y), outside of any fixed vicinity of the origin,
is bounded for sufficiently small x, and that f(x, b) and f(x, -b) do not
vanish for $0 < x \le a$ and are of opposite signs.


Clearly, (i) is contained in the following theorem: 


(I) If $\phi(x)$ and f(x, y) satisfy (*), and if f(x, b) > 0 and f(x, -b) < 0
when $0 < x \le a$, then every solution path of (3) issuing from an
interior point $(x_0, y_0)$ of the rectangle (11) can be continued for all positive
$x\ (< x_0)$, and (10) holds for all continuations.


Correspondingly, the existence statement of (ii) is contained in the 
following : 


(II) If $\phi(x)$ and f(x, y) satisfy (*) and if f(x, b) < 0 and f(x, -b) > 0
when $0 < x \le a$, then there exists for $0 < x \le a$ at least one solution path
y = y(x) of (3), and (10) holds for all such paths.


Finally, the uniqueness statement in (ii) is implied by the following: 


(II bis) If $\phi(x)$ and f(x, y) satisfy the conditions of (II) and, in
addition, f(x, y) is a non-increasing function of y for every fixed x > 0, then
only one solution path y = y(x) of (3) can satisfy (10).


4. In order to prove (I) and (II), it will first be shown in either case
that if a solution y = y(x) of (3) exists for all small x > 0, then (10) is
satisfied by it.


First, if y = y(x) is a solution for which the limit y(+0) exists, then
this limit is necessarily zero. For if this were not the case, it would follow that
$1/|f(x, y(x))| < C$ holds for every sufficiently small x > 0 and for some
C > 0. But (3) shows that y′(x) is of constant sign. Hence,

$\int_{+0}^{x} dt/\phi(t) \le C \int_{+0}^{x} |y'(t)|\, dt = C\, |y(x) - y(+0)|.$

This, however, contradicts (4).

Since f(x, b) and f(x, -b) are of opposite signs, there exists,
corresponding to every x > 0, at least one y for which f(x, y) = 0. For a fixed
x > 0, let $y = y^{*}(x)$ be the maximum of such values y. Similarly, let $y = y_{*}(x)$
be the minimum of such values y. Then

(12) $-b < y_{*}(x) \le y^{*}(x) < b$, where $0 < x \le a$,
and
(13) $y^{*}(x) \to 0, \quad y_{*}(x) \to 0$ as $x \to +0$.

The limit relations (13) are consequences of the fact that 1/f(x, y) is bounded
for small x outside of any vicinity of (0, 0).

What remains to be shown is that the limit y(+0) must exist. This is
obvious if y(x) is monotone for small x, since the existence of y(x) implies,
of course, that the curve y = y(x) remains in R. In the remaining case,
where y(x) is not monotone for sufficiently small x, the derivative y′(x) must
change its sign and therefore must be zero for certain arbitrarily small values
of x. In other words, there must exist a sequence $x_1, x_2, \cdots$ satisfying
$x_n \to 0$ and $f(x_n, y(x_n)) = 0$. But then

(14) $\min y_{*}(x) \le \min y(x) \le \max y(x) \le \max y^{*}(x),$

where the minima and maxima refer to the interval $x_{n+1} \le x \le x_n$.
For if (14) did not hold, it would follow that there exists an $x = x^0$, such
that $x_{n+1} < x^0 < x_n$ and, for example,

(15) $y(x^0) > \max y^{*}(x)$, where $x_{n+1} \le x \le x_n$.

This implies that f(x, y(x)) is not zero at $x = x^0$, and so it does not vanish
on an open x-interval containing $x = x^0$. Let $x^{*}$ denote an end-point of this
interval. Then $f(x^{*}, y(x^{*})) = 0$; hence, by the definition of $y^{*}(x)$,

(16) $y(x^{*}) \le y^{*}(x^{*})$, where $x_{n+1} \le x^{*} \le x_n$.

But y(x) is monotone on the interval having the end-point $x^{*}$, since y′(x)
does not vanish in the interior. Hence at one end-point, say at $x = x^{*}$,

$y(x^{*}) > y(x^0).$

But this contradicts (15) and (16).

Since (13) and (14) imply (10), the proof of the last italicized statement 
is complete. 

5. In order to prove (I), there remains to be shown that, if y = y(x)
is any solution path of (3) through an interior point $(x_0, y_0)$ of (11), then
y(x) can be continued for all positive $x\ (< x_0)$. However, such a continuation
could not be made only if y(x) approached the lower or upper boundary of
R, as x approaches some positive number. But this is impossible, since
y′(x) > 0 if y(x) is near b, and y′(x) < 0 if y(x) is near -b. This
establishes (I).

In order to prove (II), let $x_1, x_2, \cdots$ be a sequence of numbers
satisfying $a > x_1 > x_2 > \cdots$ and $x_n \to 0$. Since $f(x_1, b) < 0$, there exists, on some small
interval to the right of $x = x_1$, a solution path $y = y_1(x)$ satisfying $y_1(x_1) = b$.
By using arguments as above, it is easy to see that this solution can be
continued for all x contained in the interval $x_1 \le x \le a$. Similarly, let $y = y_2(x)$,
where $x_2 \le x \le a$, be a solution of (3) satisfying $y_2(x_2) = b$. Such a solution
path $y = y_2(x)$ may be so selected that $y_2(x) \le y_1(x)$ holds at every x
at which both $y_1(x)$ and $y_2(x)$ are defined (for if $y_2(x^0) = y_1(x^0)$ at some $x^0$,
then $y_2$ may be chosen to be identical with $y_1$ when $x \ge x^0$). On proceeding
in this manner, one obtains a sequence of solution paths $y = y_n(x)$, where
$n = 1, 2, \cdots$, such that each of the functions

(17) $y_m(x),\ y_{m+1}(x),\ \cdots$

is defined on the interval $x_m \le x \le a$ and, for every fixed x in this interval,
the sequence (17) is monotone non-increasing in n. Hence, the sequence is
convergent on this closed interval. But, since $f(x, y)/\phi(x)$ is continuous on
the rectangle $x_m \le x \le a$, $-b \le y \le b$, the functions (17) are equicontinuous
on the interval $x_m \le x \le a$. Hence, the convergence of the sequence is
uniform on this interval, which implies that the limit function of the sequence
is a solution of (3).

This completes the proof of (II). 

In order to prove (II bis), suppose that, for every fixed x contained in
the interval $0 < x \le a$, the function f(x, y) is a non-increasing function of
y and that there exist two distinct solutions, say $y = y^1(x)$ and $y = y^2(x)$,
defined for all small positive x. Then, by the last italicized statement, (10)
holds for $y = y^1(x)$ and for $y = y^2(x)$. Since these solutions are distinct,
there is some positive $x^0$ at which $y^1(x^0) - y^2(x^0) > 0$. By continuity,
this inequality holds for all x sufficiently close to $x^0$. Thus it follows
from (3), by subtraction, that $y^1(x) - y^2(x)$ has a non-positive derivative
and is, therefore, non-increasing in a vicinity of $x^0$. Accordingly, if x is
sufficiently near and less than $x^0$,

(18) $y^1(x) - y^2(x) \ge y^1(x^0) - y^2(x^0) > 0.$

This argument also shows that $y^1(x) - y^2(x)$ is non-increasing for all positive
$x \le x^0$, and so (18) is valid for all such x. But (18) implies that

$0 = y^1(+0) - y^2(+0) \ge y^1(x^0) - y^2(x^0) > 0.$

This contradiction proves (II bis).


6. As seen above, (i) and (ii) are simple corollaries of (I), (II) and
(II bis). In what follows, other applications of the general theorems will be
deduced, by considering the Briot-Bouquet equation (1) and its generalization
(2) under very relaxed restrictions on the non-linear terms.


(iii) Let $\lambda \ge 1$, and let g(x, y) be a real-valued function which is
continuous on the rectangle

(6) $0 \le x \le a, \quad -b \le y \le b$

and satisfies the condition

(19) $g(0, y) = o(|y|)$ as $y \to 0$;

finally, let p be any positive number. Then (6) contains a rectangle (9),
having the property that all those solutions of the differential equation

(20) $x^{\lambda} y' = p y + g(x, y)$

which issue from points of (9) satisfy (10).


In fact, if (20) is identified with (3), then (4) is satisfied by $\phi(x) = x^{\lambda}$,
since $\lambda \ge 1$. Hence, it is sufficient to ascertain that the conditions required for

$f(x, y) = p y + g(x, y)$

in (I) are satisfied if the rectangle (11) in (I) is replaced by some rectangle
(9). But this is obvious from p > 0 and from (19). In fact, instead of (19),
only

(19 bis) $|g(0, y)| < p\,|y|$ for small $|y|$, $(0 < |y| < b')$,

is needed.
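As an illustration of (iii), and not as part of the proof, one may follow a few solution paths numerically toward x = +0. The sketch below rests on choices that do not occur in the text: λ = 1, p = 1 and g(x, y) = y², for which the equation x y′ = y + y² has the closed-form solutions y = cx/(1 − cx). The helper `integrate_toward_zero` is a hypothetical classical Runge-Kutta routine which steps in t = log x so that equal steps remain usable near the singular point.

```python
import math

def integrate_toward_zero(f, phi, x0, y0, x_end, steps=4000):
    """RK4 for phi(x) y' = f(x, y), stepped in t = log(x) from x0 down to x_end."""
    def dydt(t, y):
        x = math.exp(t)
        return x * f(x, y) / phi(x)      # dy/dt = x * dy/dx
    t, y = math.log(x0), y0
    h = (math.log(x_end) - t) / steps    # negative step: x decreases
    for _ in range(steps):
        k1 = dydt(t, y)
        k2 = dydt(t + h / 2, y + h * k1 / 2)
        k3 = dydt(t + h / 2, y + h * k2 / 2)
        k4 = dydt(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

p = 1.0
f = lambda x, y: p * y + y * y           # illustrative g(x, y) = y**2, so g(0, y) = o(|y|)
phi = lambda x: x                        # illustrative choice lambda = 1

x0, x_end = 0.5, 1e-4
starts = (-0.3, -0.1, 0.1, 0.2)
finals = {y0: integrate_toward_zero(f, phi, x0, y0, x_end) for y0 in starts}

def exact(x, y0):
    """Closed form y = c x / (1 - c x) with c fixed by y(x0) = y0."""
    c = y0 / (1.0 + y0) / x0
    return c * x / (1.0 - c * x)
```

Every path started in the small rectangle arrives near y = 0 at x = 10⁻⁴, in agreement with (10), and the computed values agree with the closed form.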
Similarly, (II) implies the following theorem:

(iv) If the assumptions of (iii) are satisfied, but (20) is replaced by

(21) $x^{\lambda} y' = -p y + g(x, y) \qquad (-p < 0),$

then there exists at least one solution y = y(x) which exists for all small
x > 0, and all such solutions satisfy (10).


Remark. In order to assure that the solution established by (iv) is 
unique, it is sufficient to impose on g(x,y) the monotony conditions which 
render (II bis) applicable to the particular case (21) of (3). 
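A simple instance may make the contrast with (iii) concrete. The example is an assumption made here for illustration and does not occur in the text: λ = 1, p = 1 and g(x, y) = x, so that (21) becomes x y′ = −y + x, with general solution y = x/2 + c/x. Since g does not depend on y, the monotony condition of (II bis) holds. The names `solution`, `residual` and `exit_abscissa` are hypothetical.

```python
def solution(x, c):
    """General solution of x*y' = -y + x: y = x/2 + c/x (c is the integration constant)."""
    return x / 2.0 + c / x

def residual(x, c, h=1e-6):
    """x*y' - (-y + x), with y' from a central difference; near zero for a true solution."""
    dy = (solution(x + h, c) - solution(x - h, c)) / (2.0 * h)
    return x * dy - (-solution(x, c) + x)

def exit_abscissa(c, b, x_start=1.0, shrink=0.999):
    """Walk x toward 0 and return the first sampled x where |y| >= b, or None if never."""
    x = x_start
    while x > 1e-9:
        if abs(solution(x, c)) >= b:
            return x
        x *= shrink
    return None

b = 1.0
exits = {c: exit_abscissa(c, b) for c in (0.0, 0.01, -0.01)}
```

Only c = 0 gives a path that stays inside |y| < b down to the smallest sampled x. Every other path reaches one of the levels y = ±b at a positive abscissa, as described in the refinement of (ii).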


QUEENS COLLEGE, FLUSHING, N. Y. 
THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] H. Dulac, Curvas definidas por una ecuación diferencial de primer orden y de
primer grado, Madrid, 1933.

[2] H. Liebmann, “Geometrische Theorie der Differentialgleichungen,” Encyklopaedie
der mathematischen Wissenschaften, III D8 (1914).

[3] O. Perron, “Beweis für die Existenz von Integralen einer gewöhnlichen
Differentialgleichung in der Umgebung einer Unstetigkeitsstelle,” Mathematische
Annalen, vol. 75 (1914), pp. 256-273.

[4] A. Wintner, “Unrestricted Riccatian solution fields” (to appear elsewhere).




CONVERSE LINEARITY CONDITIONS.* 


By R. H. BING.


1. Introduction. A point is a between-point of the plane set M if it is
between two points of M; it is a middle-point of M if it bisects some interval
having its end points on M. The function f(x) satisfies the between-point
condition on the set E of x’s if for each $x_0$ of E, the point of the graph of f(x)
having $x_0$ as an abscissa is a between-point of the graph of f(x); f(x) satisfies
the middle-point condition on E if for each $x_0$ of E, the point of the graph of
f(x) having $x_0$ as an abscissa is a middle-point of the graph of f(x).

Examples of functions satisfying the middle-point condition on their
ranges are

$f(x) = mx + k$ and $f(x) = \sin x \quad (-\infty < x < \infty).$


It has been shown [2, p. 253; also see 1] that a continuous function
f(x) $(a \le x \le b)$ is linear if it satisfies the middle-point condition for values
of x between a and b. Theorem 1 is a modification of this result.

THEOREM 1. The continuous function f(x) $(a \le x \le b)$ is linear if it
satisfies the between-point condition for values of x between a and b.


Proof. Letting F(x) denote the function whose graph is the interval
having end points at [a, f(a)] and [b, f(b)], we consider

(1) $u(x) = f(x) - F(x) \qquad (a \le x \le b).$

We note that u(x) satisfies the between-point condition for values of x between
a and b and that u(a) = u(b) = 0. As u(x) is continuous and has a closed
range, it takes on a least upper bound on a closed subset E of its range.
If $x_0$ is the maximum element of E and if $u(x_0) \neq 0$, then u(x) does not
satisfy the between-point condition at $x_0$. Therefore

(2) $u(x) \le 0 \qquad (a \le x \le b).$

Considering the greatest lower bound of u(x), we find that

(3) $u(x) \ge 0 \qquad (a \le x \le b).$


* Received February 11, 1946. Presented to the American Mathematical Society, April 27, 1946.




From (2) and (3) we have that

(4) $u(x) = 0 \qquad (a \le x \le b)$

and from (1) and (4) that f(x) is linear.

THEOREM 2. The continuous function f(x) (a < x < b) is linear if
$\int_a^b f(x)\,dx$ exists and for each $x_0$ between a and b there are an $h = h(x_0)$
and a $k = k(x_0)$ such that

(5) $k^2 \left[ h f(x_0) - \int_{x_0 - h}^{x_0} f(x)\,dx \right] = h^2 \left[ \int_{x_0}^{x_0 + k} f(x)\,dx - k f(x_0) \right].$
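The content of condition (5) can be tested on polynomials with exact rational arithmetic. The sketch below reads (5) as k²[h f(x₀) − ∫ f over (x₀ − h, x₀)] = h²[∫ f over (x₀, x₀ + k) − k f(x₀)], a form inferred from (7) and therefore an assumption. The helpers `integral`, `value` and `condition_5_gap` are hypothetical names. A linear function should satisfy the relation for every admissible h and k, while f(x) = x² should satisfy it for none.

```python
from fractions import Fraction as Fr

def integral(coeffs, lo, hi):
    """Exact integral over [lo, hi] of the polynomial sum(coeffs[i] * x**i)."""
    anti = lambda x: sum(c * x ** (i + 1) / (i + 1) for i, c in enumerate(coeffs))
    return anti(hi) - anti(lo)

def value(coeffs, x):
    """Value of the polynomial sum(coeffs[i] * x**i) at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def condition_5_gap(coeffs, x0, h, k):
    """Left member minus right member of (5) as read above; zero exactly when (5) holds."""
    left = k ** 2 * (h * value(coeffs, x0) - integral(coeffs, x0 - h, x0))
    right = h ** 2 * (integral(coeffs, x0, x0 + k) - k * value(coeffs, x0))
    return left - right

linear = [Fr(3), Fr(-2)]            # f(x) = 3 - 2x
quadratic = [Fr(0), Fr(0), Fr(1)]   # f(x) = x**2
x0 = Fr(1, 2)
pairs = [(Fr(1, 3), Fr(1, 5)), (Fr(1, 4), Fr(1, 4)), (Fr(2, 5), Fr(1, 7))]

linear_gaps = [condition_5_gap(linear, x0, h, k) for h, k in pairs]
quadratic_gaps = [condition_5_gap(quadratic, x0, h, k) for h, k in pairs]
```

For f(x) = x² the difference of the two members works out to −h²k²(h + k)/3, which is never zero for positive h and k. This agrees with the theorem, since a parabola is not linear.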


Proof. First, we shall show that there is a continuous function f(z) 
such that (a<r<b). 
Assume that (b,y:) and (b,y2) are limit points of the graph of f(z). 
b 
Suppose that c and d are values such thata<c<d<b. Since f f(x)dx 


d 
exists, f f(y)dy (aS2=4) is continuous and takes on a maximum M 


and a minimum m. Let L(x) and R(x) (aSx2=b) be continuous linear 
functions such that L(b) = R(b) = (y: + y2)/2, L(x) > f(x) > R(z) 


d 
(c<a<d), L(t) >0>R(z) L(c)dx > M and 


d d 
m> f R(x)dx. Since L(x) >f(x) (ec <x<d) and L(y)dy>M 


(a=xSc), it follows that 


d 
(6) (f(y) —L(y)|dy <0 


Let r and s be values between d and b such that r < s, f(r) > L(r)
and f(s) < R(s). Denote by x′ the largest value for which f(x) - L(x)
$(d \le x \le s)$ assumes a maximum. Assume that there are values h and k
such that (5) is satisfied for x′ equal to $x_0$. We note that if f(x) is linear,
then (5) holds for all h and k such that $a \le x_0 - h < x_0 < x_0 + k \le b$.
Hence,

(7) $k^2 \left\{ h[f(x') - L(x')] - \int_{x'-h}^{x'} [f(x) - L(x)]\,dx \right\} = h^2 \left\{ \int_{x'}^{x'+k} [f(x) - L(x)]\,dx - k[f(x') - L(x')] \right\}.$




From (6) and the fact that f(a’) — L(2’) is a positive maximum for f(x) 
—I(«) (das), it follows that the first member of (7) is not negative. 
Therefore, 


(8) — Le) = — 


Since x’ is the largest value for which f(z) — L(x) (d= 2s) assumes a 


a’ +k 
maximum, it follows from (8) that 2 +k>=s and f f(z)dz > 0. 
8 


Let / be the set of all values x such that s=2=b and f- Lf(y) 
—L(y)]dy=0. By the methods of the last paragraph we find that b is 


b 

either a point or a limit point of £. Since if f(x)dz exists, it follows that 
b 

f [f(z) —L(«)|dx = 0. A similar argument leads to the contradiction that 


b 
f [f(c) — R(x) ]de < 0. 
8 
Assume that f(z) > «© as r—>b. There is a continuous linear func- 


tion G(x) such that f(x) > G(x) (c<a<d), G(x) <0 (a<x<ad), 
d 
m > f G(x)dzx and there is a value ¢ between d and b for which f(t) < G(t). 


Let x’ be the largest value for which f(z) —G@(z) (da < b) assumes its 
minimum. But there are not an h and a k such that (5) is satisfied for 2’ 
equal to x. Likewise, the assumption that ~— as b leads to a 
similar contradiction. Hence, f(a) may be defined so as to be continuous at b. 
Also, it may be defined so as to be continuous at a. Therefore, there is a con- 
tinuous function (aS ab) such that (axxr<b). 

Let F(x) (a= 2b) be a continuous linear function such that f(a) 
=F (a) and f(b) = F(b). Assume that — F(x) is positive for some 
value between a and b. Let 2’ be the largest value for which f’(2) — F(z) 
takes on its maximum. But there are not an h and a k& such that (5) is 
satisfied for x’ equal to a». Also, the assumption that f(z) — F(z) is negative 
for some value between a and b leads to a contradiction. 


COROLLARY. The continuous function f(x) (a < x < b) is linear if
$\int_a^b f(x)\,dx$ exists and if for each $x_0$ between a and b there is an $h = h(x_0)$
such that

$f(x_0) = (1/2h) \int_{x_0 - h}^{x_0 + h} f(x)\,dx.$

2. Middle-points of graphs. We shall use Theorem 3 in the next section.
Theorem 4 is included because it is of interest for itself.




THEOREM 3. If $(x_0, y_1)$ and $(x_0, y_2)$ are middle-points of the graph of
the continuous function f(x) (a < x < b) and if $y_1 < y_0 < y_2$, then $(x_0, y_0)$
is a middle-point of the graph of f(x).

Proof. Let $u_1(x)$, $u_0(x)$ and $u_2(x)$ be functions whose graphs are the
images of the graph of f(x) under a rotation of an angle of π radians about
$(x_0, y_1)$, $(x_0, y_0)$ and $(x_0, y_2)$ respectively. As $y_1 < y_0 < y_2$, we have

(9) $u_1(x) < u_0(x) < u_2(x) \qquad (2x_0 - b < x < 2x_0 - a).$

Since $(x_0, y_1)$ and $(x_0, y_2)$ are middle-points of the graph of f(x), there are
values $x_1$, $x_2$ such that for i = 1, 2 we have

(10) $u_i(x_i) = f(x_i) \qquad [x_0 < x_i < \min(b, 2x_0 - a)].$

From (9) and (10) it follows that

(11) $u_0(x_1) > f(x_1)$ and $u_0(x_2) < f(x_2)$.

From (11) and the continuity of f(x) and $u_0(x)$, it follows that there is a
number $x_3$ between $x_1$ and $x_2$ such that $u_0(x_3) = f(x_3)$. Then $(x_0, y_0)$ bisects
the interval from $[x_3, f(x_3)]$ to $[2x_0 - x_3, f(2x_0 - x_3)]$ and is therefore a
middle-point of the graph of f(x).
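The argument just given is constructive and can be traced on a sample function. The sketch below is an illustration under assumptions: the choice f(x) = x³ on (−2, 2), the point x₀ = 1/2, the ordinate y₀ = 1 and the helper names are all hypothetical. It forms the rotated graph u₀ and locates x₃ by bisection between x₁ and x₂, where u₀ − f changes sign as in (11).

```python
def f(x):
    return x ** 3          # sample continuous function on (a, b) = (-2, 2)

x0, y0 = 0.5, 1.0

def midpoint_ordinate(h):
    """Ordinate of the midpoint of the chord joining the graph points at x0 - h and x0 + h."""
    return (f(x0 - h) + f(x0 + h)) / 2.0

y1 = midpoint_ordinate(0.5)   # (x0, y1) is a middle-point, reached at x1 = x0 + 0.5
y2 = midpoint_ordinate(1.2)   # (x0, y2) is a middle-point, reached at x2 = x0 + 1.2
x1, x2 = x0 + 0.5, x0 + 1.2

def u0(x):
    """Graph of f rotated by pi about (x0, y0)."""
    return 2.0 * y0 - f(2.0 * x0 - x)

def bisect(g, lo, hi, tol=1e-12):
    """Root of g on [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if (g(mid) > 0) == (g_lo > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

x3 = bisect(lambda x: u0(x) - f(x), x1, x2)
mid_x = (x3 + (2.0 * x0 - x3)) / 2.0
mid_y = (f(x3) + f(2.0 * x0 - x3)) / 2.0
```

The chord joining the graph points with abscissas x₃ and 2x₀ − x₃ then has its midpoint at (x₀, y₀) up to rounding, which is the assertion of the theorem for this sample.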

THEOREM 4. If the continuous function f(x) (a < x < b) satisfies the
middle-point condition on its range and for no value c between a and b is
either f(x) (a < x < c) or f(x) (c < x < b) linear, then the set of all
middle-points of the graph of f(x) consists of the sum of two connected
domains plus a subset of their boundaries.

Proof. The existence of a continuous function f(x) (a < x < b)
satisfying the middle-point condition and being nonlinear on every connected
subset of its range will be shown in the next section.

First, it will be shown that if $(x_0, y_1)$ and $(x_0, y_2)$ are middle-points of
the graph of f(x) and if $y_1 < y_0 < y_2$, then there is a positive number ε such
that if |δ| < ε, $(x_0 + \delta, y_0)$ is a middle-point of the graph of f(x). Employing
the notation and the methods used in the proof of Theorem 3, we have
that there are an $x_1$ and an $x_2$ such that

$u_0(x_1) > f(x_1) \qquad [x_0 < x_1 < \min(b, 2x_0 - a)]$
and
$u_0(x_2) < f(x_2) \qquad [x_0 < x_2 < \min(b, 2x_0 - a)].$

From the continuity of $u_0(x)$ and f(x), we find that there is a positive number
ε such that if |2δ| < ε, then

$u_0(x_1) > f(x_1 + 2\delta) \qquad [x_0 < x_1 + 2\delta < \min(b, 2x_0 - a)]$
and
$u_0(x_2) < f(x_2 + 2\delta) \qquad [x_0 < x_2 + 2\delta < \min(b, 2x_0 - a)].$

From the continuity of $u_0(x)$ and f(x), there is an $x_3$ between $x_1$ and $x_2$ such
that $u_0(x_3) = f(x_3 + 2\delta)$. Then $(x_0 + \delta, y_0)$ is a midpoint of the interval
from $[2x_0 - x_3, f(2x_0 - x_3)]$ to $[x_3 + 2\delta, f(x_3 + 2\delta)]$.

There are positive numbers $\epsilon_1$ and $\epsilon_2$ such that if $|\delta| < \epsilon_i$ (i = 1, 2),
then $[x_0 + \delta, (y_0 + y_i)/2]$ is a middle-point of the graph of f(x). Applying
Theorem 3, we find that a connected domain contains $(x_0, y_0)$ which contains
only middle-points of the graph of f(x).

We shall show that if $x_0$ is a number between a and (a + b)/2, then there
is more than one middle-point of the graph of f(x) with an abscissa equal to $x_0$.
Assume that $(x_0, y)$ is a middle-point of the graph of f(x) only if $y = f(x_0)$.
If h is a positive number such that $a < x_0 - h$, then the midpoint of the
segment from $[x_0 - h, f(x_0 - h)]$ to $[x_0 + h, f(x_0 + h)]$ is a middle-point
of the graph of f(x) having an abscissa equal to $x_0$. The ordinate of this
midpoint is $f(x_0)$ because $(x_0, y)$ is a middle-point of the graph of f(x) only
if $y = f(x_0)$. Hence, f(x) $(a < x < 2x_0 - a)$ is symmetric with respect to
$[x_0, f(x_0)]$. As f(x) is continuous at $2x_0 - a$, there is a continuous function
g(x) $(a \le x \le 2x_0 - a)$ such that g(x) = f(x) for values of x between a
and $2x_0 - a$. Since g(x) is not linear, we have by Theorem 1 that there is a
number $x_1$ between a and $2x_0 - a$ such that $[x_1, g(x_1)]$ is not a middle-point
of the graph of g(x). Since g(x) is symmetric with respect to $[x_0, g(x_0)]$,
we may assume that $x_1 < x_0$. Then $[x_1, f(x_1)]$ is not a middle-point of the
graph of f(x). Hence, the assumption that there is only one middle-point of
the graph of f(x) with an abscissa equal to $x_0$ leads to the contradiction that
f(x) does not satisfy the middle-point condition at $x_1$.

If $x_0$ is a value between a and (a + b)/2, then the nondegenerate
connected set of all middle-points of the graph of f(x) with abscissas equal to $x_0$
is covered by two points plus a domain consisting of middle-points of the graph
of f(x). Then the set of middle-points of the graph of f(x) with abscissas
less than (a + b)/2 consists of a simply connected domain D plus a subset of
its boundary. Also, the set of middle-points of the graph of f(x) with abscissas
greater than (a + b)/2 consists of a simply connected domain D′ plus a subset
of its boundary. The set of middle-points of the graph of f(x) with abscissas
equal to (a + b)/2 is a subset of both the boundary of D and the boundary
of D′.

3. Functions satisfying the middle-point condition. An example
[2, pp. 253-255] has been given of a continuous nonlinear function f(x)
(a < x < b) which satisfies the middle-point condition on its range. The
graph of this function is the sum of a countable number of straight line
intervals. One might wonder whether a continuous function f(x) (a < x < b)
which satisfies the middle-point condition on its range could be nonlinear on
each subinterval of its range.




THEOREM 5. There exists a bounded function F(x) (a < x < b) having
a derivative everywhere on its range, satisfying the middle-point condition for
values of x between a and b and being nonlinear on each subinterval of its
range.

Proof. Designate by $a_0, a_1, \cdots, a_n, \cdots$ the values 1, 3/4, 9/16, $\cdots$,
$(3/4)^n$, $\cdots$ and by $a_{-1}, a_{-2}, \cdots, a_{-n}, \cdots$ the values 5/4, 23/16, $\cdots$,
$2 - (3/4)^n$, $\cdots$. Let

(12) F(z) (— 1)*2z? [ + an) /5 An |. 




For other x such that $0 \le x \le 2$, F(x) is defined so as to be nonlinear on
each subinterval, so as to have a derivative at every point of its range and so
that $|F(x)| \le x^2$. See the figure.

We shall show that for each $x_0$ between 0 and 2, there are middle-points
$(x_0, y_1)$ and $(x_0, y_2)$ of the graph of F(x) such that $y_1 > x_0^2$ and $y_2 < -x_0^2$.
An application of Theorem 3 will give Theorem 5.

Let W, be the set of points on the z-axis whose abscissas 2 satisfy 
+ dn) /5 Say. We shall show that if is a number between 0 




and 2, then (2,0) is both a middle-point of Won and a middle-point of 
DWensi- 

Suppose that (2,0) is not a middle-point of any Won... There is a 
number j such that (2,0) is between a point of W2j., and a point of Woj-1. 
A computation shows that if 7 > 0, then (2,0) bisects an interval having its 


end points on Wo: + Wo;-2 and > Wen. respectively; if 7 << 0, then (2, 0) 
j 


j 
bisects an interval having its end points on Woj.s + Wojir and > Won-+ 


respectively ; if 7 = 0 and a = 1, then (20,0) bisects an interval from 
to W, + W;; if 7 =0 and 1+ (7/160) (3/4)?" S S1-+ (7/382) (3/4)? 
for m=0,1,- --, then (2,0) bisects an interval from Womss to W-om-s. 
Hence, each point of the z-axis between (0,0) and (2,0) is a middle-point 
of SWens:. Likewise, each such point is a middle-point of Won. 

There are points $(x_1, 0)$ and $(x_2, 0)$ of $\sum W_{2n+1}$ such that $x_0 = (x_1 + x_2)/2$.
We have by (12) that $[x_0, -(x_1^2 + x_2^2)/2]$ is a middle-point of the graph
of F(x). But $-(x_1^2 + x_2^2)/2 < -x_0^2$. Likewise, there is a point
$(x_0, y_1)$ which is a middle-point of the graph of F(x) and such that $y_1 > x_0^2$.
We have by Theorem 3 that F(x) satisfies the middle-point condition for
values between 0 and 2.


THEOREM 6. If each between-point of the graph of f(x) $(a \le x < b)$
is a middle-point of this graph and if f(x) $(a \le x < b)$ has a derivative at
x = a, then f(x) is linear.

Proof. There is a continuous linear function L(x) $(a \le x < b)$ whose
graph is tangent to the graph of f(x) at [a, f(a)]. Assume that there is an $x_0$
between a and b such that $f(x_0) \neq L(x_0)$. Let M(x) be the function whose
graph is the interval joining [a, f(a)] and $[x_0, f(x_0)]$. As the graph of L(x)
is tangent to the graph of f(x) at [a, f(a)], there is an $x_1$ such that
$|L(x) - f(x)| < |M(x) - f(x)|$ for values of x between a and $x_1$. Then
$[(2a + x_1)/3, M([2a + x_1]/3)]$ is a limit point of between-points of the
graph of f(x) $(a \le x < b)$ but it is not a limit point of middle-points of
this graph.


THEOREM 7. Suppose that AB is an interval from the point A to the
point B. There exist two totally disconnected closed subsets H and K of AB
such that $H \cdot K = A + B$ and such that each point of $AB - (A + B)$ is both
a middle-point of $H - (A + B)$ and a middle-point of $K - (A + B)$.


Proof. First, we shall describe a totally disconnected closed subset $S(I)$
of an interval $I$. Let $E_1$ be the segment consisting of the middle one-fifth of $I$.
Then $I - E_1$ is the sum of two intervals. Let $E_2$ be the sum of two segments
each of which is the middle one-tenth of one of these intervals. In general,
let $E_{i+1}$ be the sum of all segments $s$ such that $s$ is the middle $1/(5 \cdot 2^i)$ of a
component of $I - (E_1 + E_2 + \cdots + E_i)$. The desired point set $S(I)$ is
$I - (E_1 + E_2 + \cdots)$.

We shall note some properties of $S(I)$ that make it useful in proving
Theorem 7. It is closed and totally disconnected. We shall show that if $PQ$
is a subinterval of $I$ such that $P$ is an end point of $I$, then

(13)    measure $S(I) \cdot PQ \ge 1/2$ length $PQ$.


Since $E_1 + E_2 + \cdots$ is dense in $I$, (13) is true if it holds for each point
$Q$ of $E_1 + E_2 + \cdots$. Assume that (13) is false. Let $E_j$ be the first element
of $E_1, E_2, \cdots$ containing a point $Q$ such that (13) does not hold.
For convenience, we shall assume that the length of $I$ is 1. Then the measure
of $E_1 + E_2 + \cdots$ is $1/5 + 1/(5 \cdot 2) + 1/(5 \cdot 2^2) + \cdots = 2/5$ and the measure of
$S(I)$ is $1 - 2/5 = 3/5$. The length of each component of $E_j$ is $1/(5 \cdot 4^{j-1})$
and the length of each component of $I - (E_1 + E_2 + \cdots + E_j)$ is

    $(1/2^j)\{1 - [1/5 + 1/(5 \cdot 2) + \cdots + 1/(5 \cdot 2^{j-1})]\} = (3 \cdot 2^j + 2)/(5 \cdot 4^j)$.

The measure of the common part of $S(I)$ and a component of $I - (E_1 + E_2
+ \cdots + E_j)$ is $(1/2^j)(3/5)$. If $Q$ is a point of $E_1$, the measure of $S(I) \cdot PQ$
is 3/10 and the length of $PQ$ is no more than 3/5. Therefore, $E_j$ is not $E_1$.
If $Q'$ is the last point of $P + E_1 + E_2 + \cdots + E_{j-1}$ on $PQ$ in the order
from $P$ to $Q$, then

    measure $S(I) \cdot PQ' \ge 1/2$ length $PQ'$.

But,

    measure $S(I) \cdot QQ' = (1/2^j)(3/5)$

and

    length $QQ' \le (3 \cdot 2^j + 2)/(5 \cdot 4^j) + 1/(5 \cdot 4^{j-1}) = (3 \cdot 2^j + 6)/(5 \cdot 4^j)$.

Then,

    measure $S(I) \cdot QQ' \ge 2^j/(2^j + 2)$ length $QQ'$.

Hence, (13) holds for all $Q$ in $I$. Using a similar line of thought, we find
that if $UV$ is a subarc of $I$ not intersecting $E_1 + E_2 + \cdots + E_j$ and $U$ is
a point of $E_j$, then

(14)    measure $S(I) \cdot UV \ge 2/3$ length $UV$.
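The length bookkeeping in this argument can be checked with exact rational arithmetic. The following is only a sketch under stated assumptions: the interval is taken to have length 1, the total measure of $E_k$ is taken to be $1/(5 \cdot 2^{k-1})$ as in the series quoted in the proof, and the $2^j$ components left after removing $E_1, \cdots, E_j$ are assumed to have equal length. The names `removed_measure`, `gap` and `segment` are illustrative and do not occur in the paper.

```python
from fractions import Fraction


def removed_measure(j):
    """Total length of E_1 + ... + E_j, assuming measure(E_k) = 1/(5 * 2**(k-1))."""
    return sum(Fraction(1, 5 * 2 ** (k - 1)) for k in range(1, j + 1))


for j in range(1, 12):
    # Length of one of the 2**j equal components of I - (E_1 + ... + E_j).
    gap = (1 - removed_measure(j)) / 2 ** j
    assert gap == Fraction(3 * 2 ** j + 2, 5 * 4 ** j)

    # Length of one of the 2**(j-1) components of E_j.
    segment = Fraction(1, 5 * 4 ** (j - 1))
    assert gap + segment == Fraction(3 * 2 ** j + 6, 5 * 4 ** j)

    # Measure of S(I) inside one component, divided by the upper bound for QQ'.
    ratio = (Fraction(3, 5) / 2 ** j) / (gap + segment)
    assert ratio == Fraction(2 ** j, 2 ** j + 2)
    assert ratio >= Fraction(1, 2)

print(removed_measure(3))  # 1/5 + 1/10 + 1/20
```

The ratio $2^j/(2^j + 2)$ is at least 1/2 for every j and at least 2/3 once j is 2 or more, which is the size of constant that (13) and (14) require.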


Each middle-point of $I$ is a middle-point of $S(I)$. To see this, let $I'$ and
$S(I')$ be the images of $I$ and $S(I)$ under a rotation of $\pi$ radians about a
middle-point $R$ of $I$. If $PQ$ is the common part of $I$ and $I'$, we have by (13)
that both the measure of $S(I) \cdot PQ$ and the measure of $S(I') \cdot PQ$ is as much
as 1/2 length $PQ$. Hence $S(I)$ intersects $S(I')$ and $R$ is a middle-point of
$S(I)$.

If $K$ is an interval, let $S(K)$ denote the image of $S(I)$ under a similarity
transformation of $I$ into $K$. It will be shown that if $I$ and $K$ are two intervals
that intersect in an interval $L$ and neither $I$ nor $K$ is as much as five times
as long as the other, then $S(I)$ intersects $S(K)$ on a set of positive measure.
It will follow that if $M$ and $N$ are subintervals of a straight line and neither
$M$ nor $N$ is as much as five times as long as the other, then each middle-point
of $M + N$ is a middle-point of $S(M) + S(N)$.

If neither $I$ nor $K$ is a subset of the other, then one end point of $L$ is an
end point of $I$ and the other end point of $L$ is an end point of $K$. By (13)
we have that

    measure $S(I) \cdot L \ge 1/2$ length $L$

and that

    measure $S(K) \cdot L \ge 1/2$ length $L$.

As $S(I) \cdot L$ and $S(K) \cdot L$ are closed, they intersect on a set of positive measure.
If $K$ is a subset of $I$, let $E_j$ be the first element of $E_1, E_2, \cdots$ intersecting
$K$. Since

    length $E_j \le 1/5$ length $I <$ length $K$,

$K$ is not a subset of $E_j$. Let $L'$ be a maximal subinterval of $L$ containing no
point of $E_j$. By (14) we have that

(15)    measure $S(I) \cdot L' \ge 2/3$ length $L'$.

But one end point of $L'$ is an end point of $L$ and therefore of $K$. By (13)
we have that

(16)    measure $S(K) \cdot L' \ge 1/2$ length $L'$.

Now (15) and (16) give that $S(I)$ and $S(K)$ intersect on a set of positive
measure.

Suppose that AB is the interval from (0,0) to (2,0). As in Theorem 5, 
designate by do, the values 1, 3/4,---, (3/4)",- - - and by 
@2,°**, the values 5/4, 23/16,:--, 2— (3/4)",---. Let Mn 


be the set of points (z,0) such that (Vans. + dn) /8 Sa Sd, let H—(A+B) 
= and let K— (A +B) = 

If (2, 0) is between two points of Meas, it is a middle-point of S(Moen.1). 
If (ao, 0) is a point of AB— (A + B) but is not between any two points of 
any Mon,;, there is a j such that (2,0) is between a point of Mj. and a 




point of M/.;-,. A computation shows that if 7 < 0, then (Yo, 0) is a middle- 
point of + M2j-s and is therefore a middle-point of S(Moj.1) 
+ S(Msj-1) + S(Moj-s) ; if 7 > 0, then (ao, 0) is a middle-point of S(M2j,3) 
+ + S(Moj-1) if j = 0 and x <1, then (2,0) is a middle-point 
of S(M;) + S(M,) + S(M_,); if = 0 and 1+ (7/256) (3/4)°™"# S251 
+ (7/32) (3/4)*"** for m=0,1,---, then (%,0) is a middle-point of 
S( Moms.) + S(M_om-3). Hence, each point of AB— (A + B) is a middle- 
point of K— (A+B). Likewise, it can be shown that each point of AB 
— (A+ B) is a middle-point of H— (A+B). 

THEOREM 8. There exists a bounded function $F(x)$ $(a < x < b)$ satisfying
the middle-point condition and having a derivative on its range such
that $F(x)$ is nonlinear on each subinterval of its range and every between-point
of the graph of $F(x)$ is a middle-point of this graph.


Proof. The function 


f(x) = (0<a2<1) 


2" 


satisfies all of the conditions of this theorem except the one of boundedness. 

Let AB be the interval from (0,0) to (1,0) and let H and K be totally
disconnected closed subsets of AB such that $H \cdot K = A + B$ and each point
of $AB - (A + B)$ is both a middle-point of $H - (A + B)$ and a middle-point
of $K - (A + B)$. We define $F(x)$ $(0 < x < 1)$ to be a function
having a derivative on its range and such that $|F(x)| < 2$, the projection
on the x-axis of the points of the graph of $F(x)$ for which $F(x) = x$ is the
set $H - (A + B)$, the projection on the x-axis of the points of the graph of
$F(x)$ for which $F(x) = -x$ is the set $K - (A + B)$ and $F(x)$ is nonlinear
on each subinterval. Since each point between (0,0) and (1,1) is a
middle-point of the graph of $F(x)$ and each point between (0,0) and
(1,−1) is a middle-point of the graph of $F(x)$, it follows by Theorem 3
that every between-point of the graph of $F(x)$ is a middle-point of this graph.


THE UNIVERSITY OF TEXAS. 


REFERENCES. 
1. E. F. Beckenbach, “On a characteristic property of linear functions,” Bulletin of
the American Mathematical Society, vol. 51 (1945), pp. 923-930.
2. R. Courant and D. Hilbert, Methoden der mathematischen Physik, vol. II, Berlin,
1937.




A DENSITY THEOREM FOR POWER SERIES.* 


By R. P. Boas, Jr. 


Let $f(z) = \sum c_n z^n$ have $|z| = 1$ as its circle of convergence.

THEOREM. Suppose that $\{c_n\}$ is not bounded, but that there are numbers
M, L and $\beta$, $\beta > 1$, such that $|c_{\lambda_n}| \le M$ when $|\lambda_n - n\beta| < L$. Then $f(z)$
cannot be majorized by an integrable function $\psi(\theta)$ in any sector of the unit
circle of angle exceeding $2\alpha = 2\pi(1 - 1/\beta)$; that is, we cannot have

(1)    $|f(re^{i\theta})| \le \psi(\theta)$,    $0 \le r < 1$,

in any $\theta$-interval of length exceeding $2\alpha$. The same conclusion holds if $c_n$
does not approach zero, but $c_{\lambda_n} \to 0$ when $|\lambda_n - n\beta| < L$. In particular, in
either case $f(z)$ must have a singular point in every arc of $|z| = 1$ of length
exceeding $2\alpha$.


If we have = 0 or even = O(7"), > 1, the existence of a singular 
point, but not the impossibility of (1), follows by Pólya's gap theorem¹
merely from lim A,/n = 8, with no further hypothesis on {cn}. For an illus- 


tration, we can take f(z) = (1— where An» = 2n +1, M=0, B=2. 

The case where $\lambda_{n+1} - \lambda_n \to \infty$ and (1) fails in every sector was given
by Duffin and Schaeffer.² Our theorem corresponds to theirs as Pólya's gap
theorem corresponds to Fabry's. Our proof is an adaptation of that given by
Duffin and Schaeffer for their theorem.

We can assume without loss of generality that (1) is satisfied in
$\pi - \gamma \le \theta \le \pi + \gamma$, $\gamma > \alpha$. Then we have


f w"*f(w)dw, 


where C is the curve made up of the arc of $|z| = 1$ from $\arg z = \pi - \gamma$ to
$\arg z = \pi + \gamma$, the segments of $\arg z = \pi \pm \gamma$ from $r = \rho$ to $r = 1$ $(\rho < 1)$,
and the arc of $|z| = \rho$ from $\arg z = -(\pi - \gamma)$ to $\arg z = \pi - \gamma$. Thus


* Received November 16, 1945. 
¹ See, for example, N. Levinson, Gap and Density Theorems, New York, 1940, p. 89.
² R. J. Duffin and A. C. Schaeffer, “Power series with bounded coefficients,” American
Journal of Mathematics, vol. 67 (1945), pp. 141-154; 153.


(2) = F,(n) + F2(n), 
where 


F,(z) (pot) + t-2-1f (tet ) dt 


1 
t-*-1f ) dt, 
33-7 


When $x \to \infty$ through real positive values, $F_2(x) \to 0$. Hence $\{F_1(\lambda_n)\}$ is
bounded if $\{c_{\lambda_n}\}$ is bounded, and $F_1(\lambda_n) \to 0$ if $c_{\lambda_n} \to 0$. $F_1(z)$ is an entire
function of order 1 and type $\pi - \gamma - \log \rho$, which can be made less than
$\pi - \alpha = \pi/\beta$ by choosing $\rho$ near enough to 1. A result of Duffin and Schaeffer³
states that, if $|\lambda_n - n\beta| < L$ and the type of $F_1(z)$ is less than $\pi/\beta$, then
$F_1(x)$ is bounded for real positive x if $\{F_1(\lambda_n)\}$ is bounded, and $F_1(x) \to 0$
if $F_1(\lambda_n) \to 0$. By (2), this means that $\{c_n\}$ is bounded or $c_n \to 0$, respectively,
contradicting the hypotheses of the theorem.


BROWN UNIVERSITY.


³ Duffin and Schaeffer, op. cit., pp. 142-143.


A SOLUTION THEORY OF THE MÖBIUS INVERSION.*


By AUREL WINTNER. 


1. If the summation index m runs through all divisors of n (including
the divisor m = 1 and, if n ≠ 1, the divisor m = n), then the linear transformation

(1)    $\sum_{m|n} X_m = Y_n$,    $(n = 1, 2, \cdots)$

of the infinite sequence $(X_1, X_2, \cdots)$ into $(Y_1, Y_2, \cdots)$ has a unique inverse.
In fact, since the n-th of the equations (1) does not contain $X_{n+1}, X_{n+2}, \cdots$
and contains $X_n$, the infinite system of equations (1) can be solved recursively.
The explicit form of the resulting inversion of (1) is known to be the linear
transformation

(2)    $X_n = \sum_{m|n} \mu(n/m)\, Y_m$,    $(n = 1, 2, \cdots)$.

Here $\mu(1), \mu(2), \cdots$ denotes the sequence of the absolute constants $\mu(k) =
\pm(\tfrac{1}{2} \pm \tfrac{1}{2})$, defined by the recursive formula and the initial condition

(3)    $\sum_{l|k} \mu(l) = 0$ if $k > 1$ and $\mu(1) = 1$

respectively. An equivalent definition is that

(3 bis)    $\mu(k) = 0$ or $\mu(k) = (-1)^{\nu}$

according as k is not square-free or is the product of exactly $\nu = \nu(k)$ distinct
primes (with the understanding that k = 1 belongs to the second case, with
$\nu(1) = 0$).
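The equivalence of the recursive definition (3) and the explicit definition (3 bis) can be checked numerically for small k. The sketch below is illustrative only; the function names `mobius_recursive` and `mobius_factor` are assumptions introduced here and are not part of the paper.

```python
def mobius_recursive(limit):
    """mu(1..limit) from (3): mu(1) = 1, and the sum of mu(l) over l | k is 0 for k > 1."""
    mu = [0] * (limit + 1)
    mu[1] = 1
    for k in range(2, limit + 1):
        # Solve (3) for mu(k): minus the sum of mu(l) over the proper divisors l of k.
        mu[k] = -sum(mu[l] for l in range(1, k) if k % l == 0)
    return mu


def mobius_factor(k):
    """mu(k) from (3 bis): 0 unless k is square-free, else (-1)**(number of primes)."""
    sign, p = 1, 2
    while p * p <= k:
        if k % p == 0:
            k //= p
            if k % p == 0:  # p**2 divided the original argument
                return 0
            sign = -sign
        p += 1
    return -sign if k > 1 else sign  # a leftover factor k > 1 is one more prime


mu = mobius_recursive(200)
assert all(mu[k] == mobius_factor(k) for k in range(1, 201))
print(mu[1:13])  # [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]
```

The recursion in `mobius_recursive` is the same triangular elimination that solves (1) step by step, applied to the data $Y_1 = 1$, $Y_2 = Y_3 = \cdots = 0$.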

Whether n is or is not square-free, let $\nu(n)$ denote the number of its
distinct prime divisors (e.g., $\nu(12) = 2$). Then it is easily realized that
$2^{\nu(n)}$ is the number of the square-free divisors of n. Hence, if $\tau(n)$ denotes
the number of all divisors of n, then

(4)    $1 \le 2^{\nu(n)} \le \tau(n)$

and, since $\nu(2^k) = \nu(2)$ but $\tau(2^k) = k + 1$,

(5)    $\limsup_{n \to \infty} \tau(n)/2^{\nu(n)} = \infty$.
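A brute-force check of (4), and of the growth along powers of 2 that gives (5), may be sketched as follows. The helper names `divisors`, `nu` and `is_square_free` are assumptions of this sketch, and the range tested is arbitrary.

```python
def divisors(n):
    """All positive divisors of n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]


def nu(n):
    """Number of distinct prime divisors of n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)


def is_square_free(d):
    """True when no square p*p with p >= 2 divides d."""
    return all(d % (p * p) != 0 for p in range(2, int(d ** 0.5) + 1))


for n in range(1, 301):
    square_free_count = sum(1 for d in divisors(n) if is_square_free(d))
    tau = len(divisors(n))
    assert square_free_count == 2 ** nu(n)   # count of square-free divisors
    assert 1 <= 2 ** nu(n) <= tau            # inequality (4)

# Along n = 2**k the quotient tau(n) / 2**nu(n) equals (k + 1) / 2.
ratios = [len(divisors(2 ** k)) / 2 ** nu(2 ** k) for k in range(1, 8)]
print(ratios)  # [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
```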


* Received January 23, 1946. 




In view of the sieve process, the infinite matrix defining the linear substitution
(1) may be described as follows: The n-th column consists of the
periodic sequence

(6)    $(0)_{n-1}, 1, (0)_{n-1}, 1, \cdots$,

of period n, where $(0)_{n-1}$ denotes the block $0, \cdots, 0$ of $n - 1$ consecutive
zeros (which is missing when n = 1; that is, every element of the first column
is 1). Correspondingly, it is seen from (2) that the unique inverse of this
matrix results by writing the elements of the sequence

(7)    $(0)_{n-1}, \mu(1), (0)_{n-1}, \mu(2), \cdots$

into the n-th column (so that the sequence $\mu(1), \mu(2), \cdots$ forms the first,
the sequence $0, \mu(1), 0, \mu(2), \cdots$ the second, $\cdots$ column of the inverse
matrix). Hence, if E (“Eratosthenes”) and M (“Möbius”) denote the
infinite matrices representing the transposed matrices of the linear substitutions
(1) and (2) respectively, then the linear substitutions assigned by E
and M are

(8)    E:  $\sum_{m=1}^{\infty} x_{nm} = y_n$,    $(n = 1, 2, \cdots)$

and

(9)    M:  $\sum_{m=1}^{\infty} \mu(m)\, y_{nm} = x_n$,    $(n = 1, 2, \cdots)$.

In fact, it is clear that the sequences (6), (7) represent the n-th rows of the
matrices of (8), (9) respectively.
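The row structure of E and M can be made concrete on a finite leading block. The sketch below assumes a truncation size `N` chosen arbitrarily; because both matrices vanish below the diagonal, the leading N-by-N block of a product equals the product of the leading blocks, so the finite check is exact. The names `mobius`, `E`, `M` and `matmul` are illustrative.

```python
def mobius(k):
    """Moebius function via (3 bis)."""
    sign, p = 1, 2
    while p * p <= k:
        if k % p == 0:
            k //= p
            if k % p == 0:
                return 0
            sign = -sign
        p += 1
    return -sign if k > 1 else sign


N = 30  # size of the leading block (an arbitrary choice for this sketch)

# Row n, column k (both 1-based): E has a 1 where n | k, as in (6);
# M has mu(k / n) where n | k, as in (7).
E = [[1 if k % n == 0 else 0 for k in range(1, N + 1)] for n in range(1, N + 1)]
M = [[mobius(k // n) if k % n == 0 else 0 for k in range(1, N + 1)]
     for n in range(1, N + 1)]


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(size)) for j in range(size)]
            for i in range(size)]


identity = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
print(matmul(E, M) == identity, matmul(M, E) == identity)  # True True
```

This finite identity says nothing about convergence of the infinite series in (8) and (9).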


2. Since (1) and (2) are reciprocal mates, and since (8) and (9) are 
the transposed systems of (1) and (2) respectively, (8) and (9) are formal 
reciprocal mates. In fact, (9) represents the classical “ Mobius inversion” 
of (8). 

However, the reservation just italicized is essential indeed. It is true that 
reciprocation and transposition are commutable in case of finite matrices, and 
it is also true that the matrices of the reciprocal substitutions (1), (2), being 
recursive, can be reduced to finite matrices. But this does not insure that the 
commutability of reciprocation and transposition remains legitimate, since 
the transposed matrices, that is, the matrices E, M defined by (8), (9), cannot 
be reduced to finite matrices. 

Actually, very little seems to be known as to the legitimacy of the Möbius
inversion, (9), of (8). In fact, all that I find in the literature is contained
in a remark of Hardy and Wright ([3], p. 237), according to which (9) is
sure to be implied by (8) if

    $\sum_{n=1}^{\infty} \tau(n)\,|x_n| < \infty$.

For the sake of completeness, it will be verified below that, without much
additional labor, this criterion can be refined to

    $\sum_{n=1}^{\infty} 2^{\nu(n)}\,|x_n| < \infty$,

which, in view of (4) and (5), is an actual improvement. But any criterion
of this type is of a trivial nature by necessity.

In addition, any criterion of this type supplies the answer to a question
which seems to be quite artificial. In fact, if (9) is thought of as solving the
linear equations (8), then what appears to be natural is to subject the data
$y_n$, rather than the unknowns $x_n$, to restrictions under which the formal solution
(9) of (8) is legitimate. However, it turns out that such restrictions
cannot exist, simply because the homogeneous equations

(10)    $\sum_{m=1}^{\infty} x_{nm} = 0$,    $(n = 1, 2, \cdots)$

possess solutions $x \ne (0, 0, \cdots)$. But a result of Haar ([2],
p. 178) states that

    either $\sum_{n=1}^{\infty} |x_n| = 0$ or $\sum_{n=1}^{\infty} |x_n| = \infty$

must hold for any solution $(x_1, x_2, \cdots)$ of the homogeneous equations (10).
And all of this indicates that the only natural approach to a solution theory
of (8) consists in inquiring, not into arbitrary solutions $x = (x_1, x_2, \cdots)$
of (8), but only into the solutions satisfying

(11)    $\sum_{n=1}^{\infty} |x_n| < \infty$,

solutions which will be called regular.

The purpose of the present paper is to develop the few facts which happen
to be true in a general solution theory from this point of view, which corresponds
to the principles initiated by Hilbert (in his “bounded” case, where
$\sum |x_n| < \infty$ is replaced by $\sum |x_n|^2 < \infty$; cf. the methodical considerations
of Hellinger-Toeplitz [4] and Helly [5]). It turns out, among other things,
that the range of the “solution theory” is by no means identical with the
range of the “Möbius theory.” For instance, while it is true that Ex = y
cannot possess more than one regular solution x = x(y), it is possible that
the latter exists but is not supplied by Möbius' inversion x = My, the latter
being non-existent (that is, the series (9) representing the components of the
vector My become divergent, although (11) is satisfied).


3. Since E, M result by transposing the matrices of the recursive equations
(1), (2), all elements of E, M below the diagonal vanish. Hence, the
infinite series defining the elements of the matrix products EM, ME are finite
sums, and so both EM and ME exist. Actually, both EM and ME represent
the unit matrix. This is readily verified from (3) and from the fact that the
sequences (6), (7) are the n-th rows of E, M respectively. Nevertheless, M
cannot be denoted by $E^{-1}$ (or E by $M^{-1}$), since the situation is as follows:


(I) The matrix M is both a right-hand reciprocal and a left-hand
reciprocal of the matrix E. Furthermore, M is the only left-hand reciprocal
of E. In addition, E is the only left-hand reciprocal of M. However, M is not
the only right-hand reciprocal of E. In addition, E is not the only right-hand
reciprocal of M.


In order to prove this, suppose first that there exist two matrices, $M_1$ and
$M_2$, for which $M_1E$ and $M_2E$ become the unit matrix. Then $(M_1 - M_2)E$
is the zero matrix. Hence, if $M_1 - M_2$ is not the zero matrix, and if c
denotes one of its rows containing at least one non-vanishing element, then
$c = (c_1, c_2, \cdots)$ is a non-trivial solution of the homogeneous equations
belonging to the transposed matrix of E. But (1) shows that these homogeneous
equations are

    $\sum_{m|n} c_m = 0$,    $(n = 1, 2, \cdots)$,

and imply, therefore, that $c_1 = 0$, $c_2 = 0$, $\cdots$. This contradiction proves
that $M_1 = M_2$.

The uniqueness of the left-hand reciprocal of M follows by a repetition of 
this argument. This repetition is possible, since the homogeneous equations 
belonging to the recursive systems (2) have no non-trivial solution. 

Similarly, in order to prove that M is not the only right-hand reciprocal
of E, it is sufficient to show that the homogeneous equations, Ex = 0, possess
a non-trivial solution $x = (x_1, x_2, \cdots)$. But such a solution is, for instance,
$x_n = \mu(n)/n$, since (10) is then satisfied in view of the well-known relations

(12)    $\sum_{m=1}^{\infty} \mu(nm)/(nm) = 0$,    $(n = 1, 2, \cdots)$

(Kluyver). Since (12) involves the Prime Number Theorem, it is worth
mentioning that other non-trivial solutions x of Ex = 0 may be constructed
“elementarily,” namely, by ordinary Fourier analysis (in this regard, cf.
Rajchman [6]).

Finally, the last assertion of (I) follows if it is shown that the homogeneous
system My = 0 has a non-trivial solution y. But (9) shows that this
homogeneous system is

(13)    $\sum_{m=1}^{\infty} \mu(m)\, y_{nm} = 0$,    $(n = 1, 2, \cdots)$,

which, in view of the case n = 1 of (12), is satisfied by $y_1 = 1/1$, $y_2 = 1/2$,
$y_3 = 1/3$, $\cdots$.
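The case n = 1 of (12) states that the series of $\mu(m)/m$ converges to 0, and with $y_n = 1/n$ the left side of (13) is $1/n$ times that series. Partial sums can be inspected numerically; the sketch below is only an illustration, not a proof, since the convergence itself rests on the Prime Number Theorem. The names `mobius_sieve` and `partial_sum` and the cut-off 2000 are assumptions of this sketch.

```python
from fractions import Fraction


def mobius_sieve(limit):
    """List mu with mu[k] equal to the Moebius function of k for 1 <= k <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            for multiple in range(p, limit + 1, p):
                mu[multiple] = -mu[multiple]      # one more prime factor
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0                  # divisible by a square
    return mu


def partial_sum(limit, mu):
    """Exact value of the sum of mu(m)/m over 1 <= m <= limit."""
    return sum(Fraction(mu[m], m) for m in range(1, limit + 1))


mu = mobius_sieve(2000)
print(partial_sum(10, mu))  # 19/210
for limit in (10, 100, 1000, 2000):
    print(limit, float(partial_sum(limit, mu)))
```

The partial sums stay between −1 and 1 by a classical elementary bound; that they tend to 0 is the deeper statement used in the text.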


4. Since the n-th component of the vector Ex, where $x = (x_1, x_2, \cdots)$,
is the infinite series

    $\sum_{m=1}^{\infty} x_{nm}$,

Ex exists if and only if this series converges for $n = 1, 2, \cdots$. Correspondingly,
the assertion that an x is a solution of Ex = y will always imply
that this series converges for every n. Clearly, (11) is sufficient for the
existence of Ex.

In view of (3 bis), these remarks remain valid if x, Ex, Ex = y are
replaced by y, My, My = x respectively.

A general theory of unrestricted Möbius solutions is precluded by the
following facts:


(II) If the vector x is a solution of Ex = y (for a given y), then
Möbius' vector My = M(Ex), instead of being the vector (ME)x (which, by
(I), is the solution x),


(i) need not exist;
(ii) may exist without being x.


The proof of (i) will here be omitted, since an assertion much sharper
than (i) is contained in the fact (IV) to be proved below.

As to the remaining assertion of (II), it is sufficient to observe that, if y
is the zero vector, then My exists and is the zero vector; and that the corresponding
system Ex = y, that is, Ex = 0, possesses non-trivial solutions x.

All that happens in this proof of (ii) is that My is a solution x of Ex = y,
though it is not the given solution. Another possibility is that My is a vector
not representing any solution:


(III) If Möbius' My exists (for a given y), it need not represent a
solution x of Ex = y.




In fact, if $y_n = 1/n$ in $y = (y_1, y_2, \cdots)$, then (13) is satisfied, that is,
My exists and is the vector 0. Hence, if My were a solution x of Ex = y,
it would follow that E0 = y. But this is contradicted by $y_n = 1/n$.

Incidentally, it remains undecided whether there exists a suitable y for
which Ex = y has no solution x. All that is clear is that Ex = y can never
have a unique solution (simply because a solution of Ex = y plus any solution
of Ex = 0 is a solution of Ex = y).


5. The first of the statements of (II) can be refined as follows:


(IV) There exist vectors y for which the system Ex = y has a regular
solution x, although Möbius' solution My, instead of representing this x, does
not exist.


In other words, not even the restriction (11) can prevent the case (i)
of (II). This will be proved in 6.

As mentioned after (III), it remains undecided whether or not every
vector y is representable, in terms of a suitable x, in the form y = Ex. On the
other hand, it is easy to see that the answer is in the negative if x is restricted
by (11). In fact, if $y = (y_1, y_2, \cdots)$ is so chosen as to violate

(14)    $y_n \to 0$ as $n \to \infty$,

then Ex = y cannot have a regular solution x = x(y). In order to see that
(14) is a necessary condition for the existence of a regular solution x, it is
sufficient to observe that the representation (8) of Ex = y implies the estimate

    $|y_n| \le \sum_{m=1}^{\infty} |x_{nm}| \le \sum_{k=n}^{\infty} |x_k|$,

from which the relation (14) follows if (11) is assumed.
In the same direction lies the following fact:


(V) For a given y, the system Ex = y has either no regular solution or
a unique regular solution x.


If the assertion of (V) were false (for some fixed y), it would follow by
subtraction that the homogeneous system Ex = 0 has a regular solution x
distinct from x = 0. But this contradicts Haar's result, quoted after (10).


6. The proof of (IV) depends on the arithmetical function

(15)    $\phi_n(m) = \sum_{d|n,\ d \le m} \mu(d)$

of the two independent variables n, m (the summation index, d, runs over
those divisors of n which do not exceed m).
It is clear from (15) that

(16)    $\phi_n(n) = \phi_n(n+1) = \phi_n(n+2) = \cdots = 0$,    $(n > 1)$,

by (3). It is also clear from (15) that
(17) —=ga(m) if 


The periodicity condition (17) implies that $\phi_n(m)$ is a bounded function of n,
if m is fixed. On the other hand, (16) implies that $\phi_n(m)$ is a bounded function
of m, if n is fixed. If an assertion of Bruns ([1], p. 132) concerning
Möbius' inversion were correct, it would follow that the function $\phi_n(m)$ is
bounded uniformly in n and m together. However, it was recently shown
([8], §9-§11) that this is not true, since

(18)    $\limsup_{n \to \infty,\ m \to \infty} |\phi_n(m)| = \infty$.

Trivial upper estimates are

(19)    $|\phi_n(m)| \le m$

and

(20)    $|\phi_n(m)| \le 2^{\nu(n)}$.

In fact, since $\nu(n)$ denotes the number of the distinct prime divisors of n,
the number of all square-free divisors of n is $2^{\nu(n)}$, and so it is clear from
(3 bis) that

(21)    $\sum_{d|n} |\mu(d)| = 2^{\nu(n)}$.

But (21) and (15) imply (20). On the other hand, (19) is clear from
(3 bis) and (15).
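The truncated divisor sum (15) and the two trivial estimates (19), (20) are easy to tabulate. The following sketch checks them over an arbitrary small range; it also checks that $\phi_n(m)$ stops changing once m reaches n, where (3) gives its value. The names `mobius`, `nu` and `phi` are assumptions of this sketch.

```python
def mobius(k):
    """Moebius function via (3 bis)."""
    sign, p = 1, 2
    while p * p <= k:
        if k % p == 0:
            k //= p
            if k % p == 0:
                return 0
            sign = -sign
        p += 1
    return -sign if k > 1 else sign


def nu(n):
    """Number of distinct prime divisors of n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)


def phi(n, m):
    """(15): sum of mu(d) over the divisors d of n that do not exceed m."""
    return sum(mobius(d) for d in range(1, min(n, m) + 1) if n % d == 0)


for n in range(1, 121):
    for m in range(1, 2 * n + 1):
        value = phi(n, m)
        assert abs(value) <= m               # estimate (19)
        assert abs(value) <= 2 ** nu(n)      # estimate (20)
        if m >= n:
            # Every divisor of n is included, so (3) gives the value.
            assert value == (1 if n == 1 else 0)

print(phi(30, 5))  # divisors 1, 2, 3, 5 of 30 contribute 1 - 1 - 1 - 1 = -2
```

A finite table of this kind cannot exhibit (18); it only shows that the two trivial bounds are respected.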
In order to deduce (IV) from (18), let

(22)    $\sum_{n=1}^{\infty} x_n$

be an absolutely convergent series. Then the same is true of all of the series (8).
In particular, y = Ex defines a vector y. For this y, the vector x is a solution
of Ex = y. On the other hand, substitution of (8) into (9) shows that the
first component of the vector My is the repeated series

(23)    $\sum_{n=1}^{\infty} \mu(n) \sum_{k=1}^{\infty} x_{nk}$.

Hence, in order to prove (IV), it is sufficient to show that (11) does not
imply the convergence of this repeated series.

Let $s_m$ denote the m-th partial sum of the exterior summation in (23),
that is, let

(24)    $s_m = \sum_{n=1}^{m} \mu(n) \sum_{k=1}^{\infty} x_{nk}$.

This can be rearranged into

(25)    $s_m = \sum_{j=1}^{\infty} x_j \sum_{nk=j,\ n \le m} \mu(n)$,

where the interior summation is extended over those positive integers n not
exceeding m corresponding to which there exists a positive integer k satisfying
nk = j. This means that n runs through those divisors of j which do not
exceed m. Accordingly,

(26)    $s_m = \sum_{j=1}^{\infty} x_j \sum_{d|j,\ d \le m} \mu(d)$.

Hence, if j is replaced by n, it is seen from (15) and (16) that

(27)    $s_m = \sum_{n=1}^{\infty} \phi_n(m)\, x_n$,    $(m = 1, 2, \cdots)$.
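The rearrangement of (24) into (27) can be tested exactly when only finitely many $x_n$ are different from zero, since every sum is then finite. The sketch below makes that assumption: `x` is a hypothetical integer vector with $x_j = 0$ for $j > N$, and `s_repeated`, `s_rearranged` are illustrative names for the two sides.

```python
import random


def mobius(k):
    """Moebius function via (3 bis)."""
    sign, p = 1, 2
    while p * p <= k:
        if k % p == 0:
            k //= p
            if k % p == 0:
                return 0
            sign = -sign
        p += 1
    return -sign if k > 1 else sign


def phi(n, m):
    """(15): sum of mu(d) over the divisors d of n that do not exceed m."""
    return sum(mobius(d) for d in range(1, min(n, m) + 1) if n % d == 0)


random.seed(0)
N = 60
# x[1..N] holds arbitrary integers; x_j is taken to be 0 for j > N.
x = [0] + [random.randint(-9, 9) for _ in range(N)]


def s_repeated(m):
    """(24): sum over n <= m of mu(n) times the sum over k of x_{nk}."""
    return sum(mobius(n) * sum(x[n * k] for k in range(1, N // n + 1))
               for n in range(1, m + 1))


def s_rearranged(m):
    """(27): sum over n of phi_n(m) * x_n."""
    return sum(phi(n, m) * x[n] for n in range(1, N + 1))


assert all(s_repeated(m) == s_rearranged(m) for m in range(1, 2 * N))
print(s_repeated(N) == x[1])  # True: for m >= N only the term n = 1 survives
```

For finitely supported x the partial sums settle at $x_1$; the divergence asserted in (IV) can only occur for genuinely infinite sequences.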


Since the absolute constants (15) defining the matrix of the linear substitution
(27) satisfy (18), an application of the general norm-principle
(Lebesgue-Toeplitz; cf., e.g., [7]) shows that there exist absolutely convergent
series (22) for which the sequence $s_1, s_2, \cdots$ becomes divergent.
This proves (IV), since $s_m$ is the m-th partial sum of (23).


7. According to (V), the system Ex = y has either no or just one
regular solution x = x(y). There arises the need for practicable criteria
which, when applied to the data $y_1, y_2, \cdots$ of the system Ex = y, distinguish
between the two cases. In this regard, some information is contained in the
following theorem:


(VI) In order that the data $y_1, y_2, \cdots$ be such that the system Ex = y,
where $y = (y_1, y_2, \cdots)$, has a regular solution x (which, by (V), is then
unique),

(i) the condition $\lim_{n \to \infty} y_n = 0$ is necessary,

(ii) the convergence of the series $\sum_{n=1}^{\infty} y_n$ is not necessary,

(iii) the condition $\sum_{n=1}^{\infty} |y_n| < \infty$ is not sufficient,

(iv) the existence of an $\epsilon > 0$ satisfying $\sum_{n=1}^{\infty} n^{\epsilon} |y_n| < \infty$ is sufficient.


Needless to say, it is implied by (iii) that

(i*)–(ii*) the conditions of (i) and (ii) are not sufficient,

and by (ii) that

(iii*)–(iv*) the conditions of (iii) and (iv) are not necessary.


Of the four assertions of (VI), only (i) and (iv) are “criteria” (of the
“practicable” type). Actually, (i) is a triviality, verified after (14), and
(iv) can be improved as follows:


(iv bis) If a given $y = (y_1, y_2, \cdots)$ satisfies the condition

    $\sum_{n=1}^{\infty} 2^{\nu(n)} |y_n| < \infty$,

then Ex = y has a regular solution x.


Here $2^{\nu(n)}$ denotes the arithmetical function occurring in that improvement
of the Hardy-Wright criterion which was mentioned at the beginning
of 2. However, (iv bis) is quite different from the Hardy-Wright criterion
and its improvement, since (iv bis) is a criterion of the “practicable” type,
imposing the $2^{\nu(n)}$-condition on the data $y_n$, rather than on the unknowns $x_n$,
of the system Ex = y. Cf. (VIII) and (X bis) below.

As is well-known, the number, $\nu(n)$, of the distinct prime divisors of n
satisfies the estimate

(28)    $\nu(n) = O(\log n)/\log \log n$.

Hence,

(29)    $2^{\nu(n)} = O(n^{\epsilon})$

holds for every $\epsilon > 0$. Consequently, in order to prove (iv), it is sufficient to
verify (iv bis). This will not be done here, since a refinement of (iv bis) is
contained in (VIII) below.


In order to prove (ii), let $x_1, x_2, \cdots$ be a sequence of positive numbers
satisfying (11) and

(30)    $\sum_{n=1}^{\infty} \tau(n)\,|x_n| = \infty$

(such sequences exist, since $\tau(n)$, the number of all divisors of n, does not
remain bounded as $n \to \infty$). Since (11) is assumed, it is possible to define
by (8) the components of a vector $y = (y_1, y_2, \cdots)$. For this y, the system
Ex = y admits the solution $x = (x_1, x_2, \cdots)$, and this x is a regular solution,
since it satisfies (11). However, since every $x_n$ was chosen to be positive,
it is clear from (8) and from the definition of $\tau(n)$ that

    $\sum_{n=1}^{\infty} y_n = \sum_{n=1}^{\infty} \tau(n)\, x_n$.

Hence it is seen from (30), where $x_n > 0$, that the proof of (ii) is complete.
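The identity between the sum of the $y_n$ and the $\tau$-weighted sum of the $x_n$ is a pure rearrangement of positive terms and can be illustrated on a finitely supported vector. The sketch below assumes the hypothetical choice $x_n = 1/n^2$ for $n \le N$ and $x_n = 0$ beyond; it does not reproduce the divergence (30), which needs an infinite sequence.

```python
from fractions import Fraction

N = 50
# Hypothetical positive data: x_n = 1/n**2 for n <= N, and x_n = 0 for n > N.
x = [Fraction(0)] + [Fraction(1, n * n) for n in range(1, N + 1)]


def tau(n):
    """Number of all divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


# (8): y_n is the sum of x_{nm} over m; only the indices nm <= N contribute.
y = [Fraction(0)] + [sum(x[n * m] for m in range(1, N // n + 1))
                     for n in range(1, N + 1)]

left = sum(y[1:])
right = sum(tau(n) * x[n] for n in range(1, N + 1))
print(left == right)  # True: x_j is counted once for every divisor n of j
```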


8. Of the four assertions of (VI), only (iii) remains to be considered.
It is worth while to formulate (iii) as a dual of (IV), as follows:


(VII) There exist vectors $x = (x_1, x_2, \cdots)$ for which the system
My = x has a solution $y = (y_1, y_2, \cdots)$ satisfying

(11 bis)    $\sum_{n=1}^{\infty} |y_n| < \infty$,

although Möbius' solution Ex, instead of representing this y, does not exist.


It is clear from (9) and (3 bis) that, if $y = (y_1, y_2, \cdots)$ is any vector
for which the series

(31)    $\sum_{n=1}^{\infty} y_n$

is absolutely convergent, then the vector My exists. Let this vector be denoted
by x. Then the case n = 1 of (8) and the defining relations (9) show that
the first component of Ex is the repeated series

(32)    $\sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \mu(k)\, y_{nk}$.

Since y satisfies My = x, it follows that (VII) will be proved if it is shown
that the absolute convergence of the series (31) does not imply the convergence
of the repeated series (32).
This is a precise dual of what has been proved for (22), (23) in 6.
Correspondingly, the following proof parallels that given in 6. But the proof
is by no means superfluous, since (I) and everything that follows (I) prevent
a general principle of duality.
Suppose that the series (31) is absolutely convergent, and let $t_m$ denote
the m-th partial sum of the exterior summation in (32), that is, let

    $t_m = \sum_{n=1}^{m} \sum_{k=1}^{\infty} \mu(k)\, y_{nk}$.

This can be rearranged into

    $t_m = \sum_{j=1}^{\infty} y_j \sum_{nk=j,\ n \le m} \mu(k)$,

where the interior summation runs over those positive integers k corresponding
to which there exists some n not exceeding m and satisfying nk = j.
Accordingly,

    $t_m = \sum_{j=1}^{\infty} y_j \sum_{d|j,\ d \le m} \mu(j/d)$.

Hence, if j is replaced by n,

(33)    $t_m = \sum_{n=1}^{\infty} \phi^n(m)\, y_n$,

where $\phi^n(m)$ denotes the interior sum, that is,

(34)    $\phi^n(m) = \sum_{d|n,\ d \le m} \mu(n/d)$.

Since (33), $t_m$, $\phi^n(m)$ correspond to (27), $s_m$, $\phi_n(m)$ respectively, what
corresponds to (18) is

(35)    $\limsup_{n \to \infty,\ m \to \infty} |\phi^n(m)| = \infty$.

Hence, the proof will be complete if it is shown that (35) is true.
To this end, let the summation index d in (34) be replaced by n/d.
Then (34) appears in the form

    $\phi^n(m) = \sum_{d|n,\ d \ge n/m} \mu(d)$.

Consequently, if n/m is not an integer,

    $\phi^n(m) + \phi_n([n/m]) = \sum_{d|n} \mu(d)$,

by (15). It follows therefore from (3) that (35) is implied by (15) and (18).
In fact, the omission of the assumption that n/m is not an integer introduces
an error which is bounded, hence such as to have no influence on (35).
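The relation between the two truncated sums, including the bounded correction when n/m is an integer, can be verified directly for small arguments. In the sketch below `phi` is the function (15) and `phi_upper` is the function (34); both names, and the range tested, are assumptions of this illustration.

```python
def mobius(k):
    """Moebius function via (3 bis)."""
    sign, p = 1, 2
    while p * p <= k:
        if k % p == 0:
            k //= p
            if k % p == 0:
                return 0
            sign = -sign
        p += 1
    return -sign if k > 1 else sign


def phi(n, m):
    """(15): sum of mu(d) over the divisors d of n with d <= m (empty if m = 0)."""
    return sum(mobius(d) for d in range(1, min(n, m) + 1) if n % d == 0)


def phi_upper(n, m):
    """(34): sum of mu(n/d) over the divisors d of n with d <= m."""
    return sum(mobius(n // d) for d in range(1, min(n, m) + 1) if n % d == 0)


for n in range(1, 201):
    full = 1 if n == 1 else 0          # sum of mu(d) over all d | n, by (3)
    for m in range(1, n + 6):
        combined = phi_upper(n, m) + phi(n, n // m)
        if n % m != 0:
            assert combined == full
        else:
            # The divisor n/m is counted in both sums: an error of size at most 1.
            assert combined == full + mobius(n // m)

print(phi_upper(30, 5), phi(30, 6))  # the two parts for n = 30, m = 5
```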


9. The content of (IV) is that there are data $(y_1, y_2, \cdots)$ for which
the regular solution $(x_1, x_2, \cdots)$ of (8) exists but is not attainable by
Möbius' inversion (9). However, the resulting pathological y-range does not
contain data $y = (y_1, y_2, \cdots)$ which have occurred thus far in the applications
of Möbius' inversion to problems occurring in the analytic theory of
numbers. Correspondingly, what really matters in those classical applications
is neither just the existence of a regular solution nor just the existence of a
Möbius solution, but the existence of both.

Thus there arises the problem of delimiting the y-range within which
Ex = y has a regular solution x represented by Möbius' inversion, My. A regular
solution x of this particular type will be called hyper-regular. A complete
characterization of the y-range of hyper-regularity involves properties of a y
which are just as obscure as are, in view of (ii)–(iii) and (i*)–(ii*) in 7,
the properties characteristic of a vector y belonging to the more inclusive
range of regularity. However, the following sufficient criterion comprises
more than what is needed in the classical applications.


(VIII) If the data $y_n$ of the system Ex = y, where $y = (y_1, y_2, \cdots)$,
satisfy the condition

(36)    $\sum_{n=1}^{\infty} 2^{\nu(n)} |y_n| < \infty$,

then there must exist an (or, according to (V), the) hyper-regular solution
x of Ex = y.
For instance, this will be the case if

(36 bis)    $\sum_{n=1}^{\infty} n^{\epsilon} |y_n| < \infty$

holds for some $\epsilon > 0$, since (36 bis) is sufficient for (36), by (29).

In order to prove (VIII), let x denote the vector My. This vector exists
(simply because its components $x_n$ are given by the series (9), the convergence
of which is assured by (3 bis) and (36), since (36) implies that (36 bis) is
satisfied by $\epsilon = 0$). Furthermore, the components of x = My satisfy the regularity
conditions (11). In fact, (9) shows that the series (11) becomes the
repeated series on the left of the obvious inequality

(37)    $\sum_{n=1}^{\infty} \left| \sum_{m=1}^{\infty} \mu(m)\, y_{nm} \right| \le \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |\mu(m)|\,|y_{nm}|$.

But it is clear from (21) that the double series on the right of this inequality,
a series of non-negative terms, can be contracted into the simple series (36).

This proves that, if (36) is satisfied, Möbius' vector My exists and represents
either a regular solution of Ex = y or no solution at all. Hence, in order
to complete the proof of (VIII), it suffices to rule out the second of these
possibilities.
10. The assertion is that the vector E(My) exists and is identical with 


the given vector y (provided that (36) is satisfied by the components of y). 
But substitution of (9) into (8) shows that the assertion E(My) = y can be 


written in the form

(38) $\sum_{m=1}^{\infty} \sum_{k=1}^{\infty} \mu(k)\, y_{nmk} = y_n$ (n = 1, 2, ...),


where it is understood that the convergence of the repeated series (38) is part 
of the statement. Accordingly, the proof of (VIII) will be complete if it is 
verified that (36) implies the truth of (38) for every n. 

First, (36) implies that 


(39) $\sum_{m=1}^{\infty} \sum_{k=1}^{\infty} |\mu(k)|\,|y_{nmk}| < \infty$ (n = 1, 2, ...).


In fact, whether (39) be true or false, the double series (39), having only 
non-negative terms, can be rearranged into 


j=1 nmk=j 

where the interior summation extends over the range of those values k corre- 

sponding to which there exists some m satisfying the condition nmk = j (in 

which both n and j are fixed). This means that the double series (39) is 

identical with 


and is therefore majorized by 
ie, @) 
=| 
j=1 


(simply because nk|j cannot hold for any n if it does not hold for n=1). 
But (21) shows that this majorant of the double series (39) is identical with 
the series (36). This proves that (36) implies (39). 

Since the repeated series (40) is just a rearrangement of the double 
series (39), and since the latter is convergent and consists of the absolute 
values of the terms occurring in the repeated series (38), it follows that the 
repeated series (38) is convergent and that, in addition, the assertion (38) 
can be rearranged, corresponding to (40), into 

$\sum_{j=1}^{\infty} \Bigl( \sum_{nk \mid j} \mu(k) \Bigr) y_j = y_n$ (n = 1, 2, ...).


Hence, in order to complete the proof of (38), all that remains to be 
ascertained is that the (finite) sum multiplying y; on the left of the last 
formula line is 0 or 1 according as $j \ne n$ or $j = n$. If j and the summation
index k are replaced by m and d respectively, this assertion appears in the form

(41) $\sum_{nd \mid m} \mu(d) = \epsilon_{nm}$,

where $(\epsilon_{nm})$ denotes the infinite unit matrix. But (41) is true. In fact, if m
is not divisible by n, then $\epsilon_{nm}$ is 0 (since n = m would imply that m is
divisible by n), and the sum on the left of (41) is vacuous (since nd cannot
divide m for any d if n itself does not). In the remaining case, that is, if the
quotient m/n is an integer, say k, the condition $nd \mid m$ on the left of (41)
can be replaced by $d \mid k$, and so the truth of (41) follows from (3) in this case
(simply because the integer k = m/n is greater than 1 or equal to 1 according as
$m \ne n$ or $m = n$).
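
The identity (41) is purely arithmetical, so it can be confirmed by brute force on a finite range. The following is a minimal sketch; the helper names `mobius` and `lhs_41` and the bound 40 are illustrative choices, not taken from the text.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # a squared prime factor gives mu = 0
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def lhs_41(n, m):
    """Left side of (41): the sum of mu(d) over all d with n*d dividing m."""
    return sum(mobius(d) for d in range(1, m + 1) if m % (n * d) == 0)


BOUND = 40
for n in range(1, BOUND + 1):
    for m in range(1, BOUND + 1):
        # The right side of (41) is the (n, m) entry of the unit matrix.
        assert lhs_41(n, m) == (1 if n == m else 0)
```

When n does not divide m the sum is empty, which matches the "vacuous" case of the argument; when m = nk the sum runs over the divisors d of k.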


11. The criterion (VIII), the proof of which is now complete, places 
the restriction on the data and is, therefore, an existence theorem. In contrast, 
the following (incomplete) dual of (VIII) will assume the existence of a 
solution of a certain restricted type; and all that will be claimed is that the 
assertion of (VIII) then becomes tautological in some respect. 


(IX) If Ex = y has a solution $x = (x_1, x_2, \ldots)$ satisfying

(42) $\sum_{n=1}^{\infty} |x_n| < \infty$,

then this solution is hyper-regular. In fact, if Ex = y and (42) are satisfied,
then My exists and is precisely x.


The existence of My means, of course, the convergence of the series (8). 
Hence, the truth of (IX), no matter how elementary, is curious indeed, since 
the situation is as follows: 


(IX bis) The assumptions of (IX), which imply the existence of My,
do not imply that

(43) $\sum_{n=1}^{\infty} |y_n| < \infty$

(although nothing short of (43) appears to guarantee the existence of My,
that is, the convergence of all the series (9), if $y = (y_1, y_2, \ldots)$ is a free
variable; actually, the $y_n$ are bound by (43) and Ex = y).


In order to see this, let $x_1, x_2, \ldots$ be a sequence of positive values
satisfying (42) and (30). The possibility of choosing such an $x = (x_1, x_2, \ldots)$
is assured by (4) and (5). Let $y = (y_1, y_2, \ldots)$ be defined by y = Ex.
This does define the values $y_n$, since (42) implies (11) and, therefore, the
convergence of the series (8). Accordingly, the assumptions of (IX) are
satisfied. Nevertheless, (43) fails to hold. This follows from (30), if the
restriction $x_n > 0$ is used in the same way as at the end of 7.




The assertion of (IX) is that (42) and (8) imply (9). But substitution 
of (8) into (9) gives 


(44) $\sum_{m=1}^{\infty} \mu(m) \sum_{k=1}^{\infty} x_{nmk} = x_n$ (n = 1, 2, ...).

Hence, the assertion of (IX) is that (42) implies (44). On the other hand,
the proof of (VIII) in 10 consisted in verifying that (36) implies (38).
And this depended only on the fact that (36) implies (39). Since (39)
remains unaltered if the summation indices m, k are interchanged, it follows,
on replacing every $y_j$ by the corresponding $x_j$, that it is superfluous to repeat
the details.


12. This formal procedure supplies some, mostly of course superficial,
criteria relating to the problem which is the dual of Möbius' inversion, namely,
to the problem of the system My = x in which y is the unknown and Ex represents
the formal Möbius solution. The simplest fact which can thus be obtained
is as follows:


(X) For any given x, the system My = x has at most one solution
$y = (y_1, y_2, \ldots)$ satisfying

$\sum_{n=1}^{\infty} \tau(n)\,|y_n| < \infty$.

This is a partial dual of (V). A complete dual would not postulate
more than

$\sum_{n=1}^{\infty} |y_n| < \infty$,

the formal y-analogue of the assumption (11) of (V). That some restriction
of y is necessary, follows from the fact that the system (13), which is the
homogeneous system My = 0, has a solution y distinct from the trivial solution,
y = 0.

In order to prove (X), suppose that My = x has a solution, y = y(x),
satisfying the assumption of (X). This means that both (9) and (36) hold
for a certain $y = (y_1, y_2, \ldots)$. But (21) shows that (9) and (36) imply
(39). On the other hand, it is clear from (9) that the double series (39)
majorizes the series (11). Hence, (11) is satisfied (and so, in particular,
Ex exists). Consequently, if n in (9) is replaced by nk, the resulting equations
can be summed with respect to k. This leads to the relations


(45) $\sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \mu(m)\, y_{nkm} = \sum_{k=1}^{\infty} x_{nk}$ (n = 1, 2, ...).


In addition, the repeated series on the left of (45) is the repeated series
(38). Since, as verified in 10, the assumption (36) implies that (38) is an
identity, it follows that the repeated series on the left of (45) is identical
with $y_n$. Consequently, (45) means that y = Ex. Since y = Ex implies that
y is determined by x, the proof of (X) is complete.

It is also seen that (X) can be amplified as follows: 


(X*) If My = x has, for a given vector x, a solution y satisfying the
assumption of (X), then this solution must be Möbius' formal solution, that is,
the vector Ex, which exists, since x must be a regular solution of Ex = y by
virtue of the y-assumption of (X).


13. However, (X*) does not supply any criterion discerning between 
the two cases allowed by (X), the case of non-existence and the case of unique 
existence. Such a criterion, namely, an existence statement corresponding to 
a dual of (iv) in (VI), is contained in the fourth of the following assertions: 


(X bis) If x is given, and y is the unknown, in the system My = x, then

(i) the condition

$\sum_{n=1}^{\infty} |x_n| < \infty$

is sufficient in order that Möbius' formal solution, which is y = Ex, should
actually be a solution of My = x, but

(ii) the condition of (i) is insufficient for the existence of a solution y
for which the restriction

$\sum_{n=1}^{\infty} |y_n| < \infty$

is satisfied, although

(iii) the somewhat stricter condition

$\sum_{n=1}^{\infty} \tau(n)\,|x_n| < \infty$

is sufficient for the existence of a solution y satisfying the restriction of (ii),
and

(iv) the still stricter condition

$\sum_{n=1}^{\infty} n^{\epsilon}\,|x_n| < \infty$

(to be satisfied by some $\epsilon > 0$) is sufficient for the existence of a solution y
for which the restriction assumed in (X) is fulfilled.


First, it is clear from the comments made at the end of 11 that, in order
to prove (i), it is sufficient to ascertain that the x-assumption of (i) implies
the sequence of conditions which results when y is replaced by x in (39).
In other words, it is sufficient to ascertain that (36) implies (39). But the
truth of this implication was verified after (40).

As regards (ii), it is enough to take a glance at the proof of the
negation in (IX bis).

Correspondingly, (iii) may be verified as follows: According to (4), the
x-condition of (iii) implies the x-condition of (i) and so, by the assertion of
(i), the existence of a solution $y = (y_1, y_2, \ldots)$ satisfying (8). Consequently,
there exists a solution y satisfying 

$\sum_{n=1}^{\infty} |y_n| \le \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |x_{nm}|$.


Since the double series on the right can be contracted into the series the 
convergence of which is the z-condition assumed in (iii), the assertion of (iii) 
follows. 


14. It also follows that the x-assumption of (iii) implies the existence
of a solution $y = (y_1, y_2, \ldots)$ satisfying the inequality

$\sum_{n=1}^{\infty} \tau(n)\,|y_n| \le \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \tau(n)\,|x_{nm}|$,

in which, however, the double series on the right need not converge. But this
(non-negative) double series, be it convergent or not, can be rearranged into

$\sum_{j=1}^{\infty} |x_j| \sum_{nm=j} \tau(n)$,

where it is understood that the interior sum denotes $\tau^*(j)$, if $\tau^*$ is the
arithmetical function defined by

(46) $\tau^*(n) = \sum_{d \mid n} \tau(d)$.

If this is inserted into the last inequality, there results the following criterion


(which is sharp, since the inequality becomes an equality when every $x_n$ is
chosen to be positive; cf. the end of 7).


(iv bis) If the data $x_n$ of the system My = x satisfy the condition

$\sum_{n=1}^{\infty} \tau^*(n)\,|x_n| < \infty$,

then there exists a solution $y = (y_1, y_2, \ldots)$ satisfying the assumption of (X).
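
The rearrangement behind (46) can be illustrated on data of finite support. The sketch below assumes that $\tau(n)$ denotes the number of divisors of n, which is how the Wigert-Ramanujan estimate quoted in this section is usually stated; the function names and the sample vector are hypothetical.

```python
def tau(n):
    """Number of divisors of n (assumed meaning of tau in the text)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def tau_star(n):
    """The function of (46): the sum of tau(d) over the divisors d of n."""
    return sum(tau(d) for d in range(1, n + 1) if n % d == 0)


N = 48
# Arbitrary integer data x_1..x_N, with x_j = 0 understood for j > N.
x = {j: (-1) ** j * j for j in range(1, N + 1)}

# Double series: sum over n and m of tau(n) * |x_{nm}|.
double_series = sum(tau(n) * abs(x[n * m])
                    for n in range(1, N + 1)
                    for m in range(1, N // n + 1))

# Contracted series: sum over j of tau*(j) * |x_j|.
single_series = sum(tau_star(j) * abs(x[j]) for j in range(1, N + 1))

assert double_series == single_series
# The elementary bounds 1 <= tau(n) <= tau*(n) on the same range.
assert all(1 <= tau(n) <= tau_star(n) for n in range(1, N + 1))
```

Grouping the terms of the double series by the product j = nm is all the contraction does, so the two sums agree exactly for integer data.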


This sharp criterion relates to (iv) in the same way as the fourth asser- 
tion of (VI) relates to (iv bis), 7. In other words, (iv) is a corollary, since 


(47) $\tau^*(n) = O(n^{\epsilon})$
holds for every $\epsilon > 0$. In fact, since the logarithm of

(48) $\tau(n) = \sum_{d \mid n} 1$

is subject to the well-known estimate 

(28 bis) $\log \tau(n) = O(\log n)/\log \log n$
(Wigert-Ramanujan), the estimate

(29 bis) $\tau(n) = O(n^{\epsilon})$

holds for every $\epsilon > 0$. Hence, (4) and (48) imply the estimate


d\n d\n d|n 
which, in view of (46), is the assertion (47). 

Needless to say, the necessity of replacing the condition of (iii) by the 
condition of (iv bis) is due to the fact that 


(4 bis) $1 \le \tau(n) \le \tau^*(n)$
but
(5 bis) $\limsup_{n \to \infty} \tau^*(n)/\tau(n) = \infty$.


This is clear from (46) and (48), since 2” =1 but lim sup 2” = o, 
n->0O 


THE JOHNS HOPKINS UNIVERSITY. 




REFERENCES. 


[1] H. Bruns, Die Grundlinien des wissenschaftlichen Rechnens, Leipzig, 1903.

[2] A. Haar, Chapter I, no. 129 in G. Szegö and G. Pólya, Aufgaben und Lehrsätze
aus der Analysis, Berlin, vol. 1 (1925).

[3] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers,
Oxford, 1938.

[4] E. Hellinger and O. Toeplitz, "Grundlagen für eine Theorie der unendlichen
Matrizen," Mathematische Annalen, vol. 69 (1910), pp. 289-330.

[5] E. Helly, "Ueber Systeme linearer Gleichungen mit unendlich vielen Unbekannten,"
Monatshefte für Mathematik und Physik, vol. 31 (1921), pp. 60-91.

[6] A. Rajchman, "Ueber eine paradoxe Eigenschaft gewisser bedingt konvergenter
unendlicher Reihen," Mathematische Zeitschrift, vol. 26 (1927), pp. 777-778.

[7] I. Schur, "Ueber lineare Transformationen in der Theorie der unendlichen Reihen,"
Journal für die reine und angewandte Mathematik, vol. 151 (1920), pp. 79-111.

[8] A. Wintner, An Arithmetical Approach to Ordinary Fourier Series, Baltimore,
1945.




METRICALLY HOMOGENEOUS SPACES.* 


By HERBERT BUSEMANN. 


The following note shows how certain questions of metric geometry
relating to the so-called four-point properties¹ can be unified and derived from a
general theorem which is in turn a simple consequence of a geometric result
obtained elsewhere.² The theorem states essentially this: a metric space R
with suitable compactness and convexity properties has constant curvature
when for any linear triple of points q, x, r and any fourth point p the distance
px is a function $\phi(pq, pr, xq, \pm xr)$. Thus it will be proved that $\phi$ is one of
three specific functions (see formulas (1), (2), (3)) occurring in euclidean,
hyperbolic, or spherical geometry respectively, whereas any of the four-point
properties assumes a priori that $\phi$ is one of these specific functions.

The exact formulation of the theorem is this: 


THEOREM. The space R is a locally isometric map of a finite dimensional
euclidean, hyperbolic, or spherical space³ if and only if it satisfies the
following five conditions:

I. R is metric (with distance xy).
II. R is finitely compact.
III. R is convex.
IV. R is locally externally convex.

If $S(z, \rho)$ denotes the set of points x with $zx < \rho$, then IV means more
explicitly this: for every point a there is a $\rho_a^1 > 0$ such that for any two points
p, q in $S(a, \rho_a^1)$ a point $r \ne q$ with pq + qr = pr exists.

V. For every point a there is a $\rho_a^2 > 0$ and a function $\phi_a(\xi_1, \xi_2, \xi_3, \xi_4)$ such
that for any four points p, q, r, x in $S(a, \rho_a^2)$ with $qx + \epsilon\, xr = qr > 0$,
$\epsilon = \pm 1$, the relation

$px = \phi_a(pq, pr, xq, \epsilon\, xr)$

holds.


* Received May 23, 1945.

¹ For the literature see Blumenthal [1], in particular pp. 69-84.

² The result referred to states that spaces with locally linear bisectors have
constant curvature, see Busemann [1] (quoted as B.) p. 268. The author takes this
occasion to point out that Theorem (1.21) in B, which characterizes the Hausdorff
spaces with a finitely compact metrization, is not new, but was proved previously by
H. E. Vaughan, see Vaughan [1], p. 532, Theorem 2.

³ More briefly we shall say that R has constant curvature.


The theorem would not be true if V were required for $\epsilon = 1$ only, even if
IV is replaced by external convexity in the large. An example is furnished
by the space consisting of the three rays $0 \le r^i < \infty$ (i = 1, 2, 3), with the
metric in which the distance of the points with parameters $r^i$ and $s^j$ is

$|r^i - s^j|$ if i = j, and $r^i + s^j$ if $i \ne j$.

This definition implies that the three origins $r^i = 0$ are identified. It is
easily seen that IV holds in the large and that

$px = \max(pq - xq,\; pr - xr)$

whenever $qx + xr = qr$, so that V holds in the large for $\epsilon = 1$. But no
neighborhood of $r^i = 0$ is even homeomorphic to the interior of a sphere
of any dimension.
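
The relation claimed for this example can be tested by brute force. The sketch below assumes the usual metric on three rays glued at their origins (the difference of the parameters on a common ray, their sum on different rays); the encoding of a point as a pair `(ray, parameter)` and the integer grid are illustrative choices.

```python
from itertools import product


def dist(a, b):
    """Assumed tripod metric on points given as (ray index, parameter >= 0)."""
    (i, t), (j, s) = a, b
    return abs(t - s) if i == j else t + s


# Integer grid on the three rays; the three origins (i, 0) have mutual distance 0.
points = [(i, t) for i in (1, 2, 3) for t in range(0, 5)]

checked = 0
for p, q, x, r in product(points, repeat=4):
    # Linear triples q, x, r with qx + xr = qr > 0 (the case epsilon = 1 of V).
    if dist(q, r) > 0 and dist(q, x) + dist(x, r) == dist(q, r):
        expected = max(dist(p, q) - dist(x, q), dist(p, r) - dist(x, r))
        assert dist(p, x) == expected
        checked += 1
```

Integer parameters keep every comparison exact. The check covers only the case $\epsilon = 1$, which is the case the example is meant to satisfy.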

The necessity of the hypotheses I to V is obvious because the Pythagorean
theorems of the geometries with constant curvatures 0, $-\beta^2$, $\beta^2$ yield
respectively,

(1) $px^2 = [pq^2 \cdot \epsilon\, xr + pr^2 \cdot xq]/qr - \epsilon\, xq \cdot xr$
(2) $\cosh(\beta px) = [\cosh(\beta pq) \sinh(\epsilon \beta xr) + \cosh(\beta pr) \sinh(\beta xq)]/\sinh(\beta qr)$
(3) $\cos(\beta px) = [\cos(\beta pq) \sin(\epsilon \beta xr) + \cos(\beta pr) \sin(\beta xq)]/\sin(\beta qr)$.
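
A numerical spot check of the euclidean and spherical relations may be useful here. The sketch assumes the form $px^2 = [pq^2 \cdot \epsilon\, xr + pr^2 \cdot xq]/qr - \epsilon\, xq \cdot xr$ for (1) and takes $\beta = 1$ on the unit sphere for (3); the sample points and function names are arbitrary and hypothetical.

```python
import math


def check_euclidean():
    """Largest error of relation (1) in the plane for epsilon = +1 and -1."""
    p, q, r = (0.7, 2.1), (0.0, 0.0), (3.0, 0.0)
    worst = 0.0
    for x, eps in (((1.0, 0.0), 1), ((5.0, 0.0), -1)):
        pq, pr, px = math.dist(p, q), math.dist(p, r), math.dist(p, x)
        xq, xr, qr = math.dist(x, q), math.dist(x, r), math.dist(q, r)
        assert math.isclose(xq + eps * xr, qr)          # qx + eps*xr = qr
        rhs = (pq ** 2 * eps * xr + pr ** 2 * xq) / qr - eps * xq * xr
        worst = max(worst, abs(px ** 2 - rhs))
    return worst


def angle(u, v):
    """Spherical distance of two unit vectors (curvature 1)."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


def check_spherical():
    """Largest error of relation (3) with beta = 1 for epsilon = +1 and -1."""
    norm = math.sqrt(0.3 ** 2 + 0.5 ** 2 + 0.8 ** 2)
    p = (0.3 / norm, -0.5 / norm, 0.8 / norm)

    def equator(t):
        return (math.cos(t), math.sin(t), 0.0)

    q, r = equator(0.0), equator(1.0)
    worst = 0.0
    for t, eps in ((0.4, 1), (1.6, -1)):
        x = equator(t)
        lhs = math.cos(angle(p, x))
        rhs = (math.cos(angle(p, q)) * math.sin(eps * angle(x, r))
               + math.cos(angle(p, r)) * math.sin(angle(x, q))) / math.sin(angle(q, r))
        worst = max(worst, abs(lhs - rhs))
    return worst


assert check_euclidean() < 1e-9
assert check_spherical() < 1e-9
```

In both checks the point x lies between q and r for $\epsilon = 1$ and beyond r for $\epsilon = -1$, so that $qx + \epsilon\, xr = qr$ holds in each case. The hyperbolic relation (2) is not tested, since it would require fixing a model of the hyperbolic plane.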


To see the sufficiency put $\delta_a^i = \sup \rho_a^i$ (i = 1, 2), where $\rho_a^i$ traverses
all numbers for which $S(a, \rho_a^i)$ satisfies IV or V respectively. If $b \in S(a, \delta_a^i)$
then $S(b, \delta_a^i - ab) \subset S(a, \delta_a^i)$, so that $\delta_b^i \ge \delta_a^i - ab$; hence by symmetry

(4) $|\delta_a^i - \delta_b^i| \le ab$.


Because of I, II, III any two points x, y of R can be connected by a
segment t(x, y). If the three points b, c, d are different and bc + cd = bd
we write (bcd). The relations (cbe) and (bde) imply (cbd) and (cde) and
conversely (compare B, Section 1). The following is a consequence of
conditions I to IV.

(5) If $p, q \in S(a, \delta_a^1/3)$ then any segment t = t(p, q) is a subsegment of a
segment t(p', q') with $ap' = aq' = 2\delta_a^1/3$.


For by the preceding remarks and IV there are pairs of points x', y' with
$ax' \le 2\delta_a^1/3$, $ay' \le 2\delta_a^1/3$, (pqy'), (x'py') hence also (x'pq) and (x'qy').
Because of II these pairs x', y' form with the natural metric (that is
$(x'_1, y'_1)(x'_2, y'_2) = x'_1 x'_2 + y'_1 y'_2$) a non-empty compact set; hence there is a
pair p', q' for which x'y' reaches its maximum p'q'. Then p', q' satisfy (5).
For if $ap' < 2\delta_a^1/3$ a point x* with (x*p'q') and $ax^* = 2\delta_a^1/3$ would exist.
But then also (x*pq') and (x*p'q') so that x*, q' would be an admissible pair
x', y' with x*q' > p'q'. Then $t(p', p) \cup t \cup t(q, q')$ is a segment which
satisfies (5).


Next observe the following consequence of I and V:


(6) If the points $q, r, x_1, x_2$ of $S(a, \delta_a^2)$ satisfy the relations
$qx_i + \epsilon\, x_i r = qr > 0$ (i = 1, 2) and $x_1 r = x_2 r$, then $x_1 = x_2$.


For then also $qx_1 = qx_2$, hence


Now put $\rho_a = \min(\delta_a^1/3, \delta_a^2)$. Then (5) and (6) show that for any
two points p, q of $S(a, \rho_a)$ and every $\alpha > 0$ points r in $S(a, \rho_a)$ with (pqr)
and $qr < \alpha$ exist, and that (pqr') and qr = qr' imply r = r'. Hence the basic
axiom D of B. p. 215 holds, so that R is a G-space (compare B. p. 227).

The only one-dimensional G-spaces are the straight line and the great 
circles (B. p. 233). In this case the following considerations are trivial, there- 
fore the space will be assumed to have at least dimension 2. 

If $p, p' \in S(a, \rho_a/2)$ and $p \ne p'$ call B(p, p') the locus of those points x
for which px = p'x. If q, r are points of $B(p, p') \cap S(a, \rho_a/2)$, then a
segment t(q, r) lies in $S(a, \rho_a)$ (B. 1.15). If x is a point of this segment then

$px = \phi_a(pq, pr, qx, xr) = \phi_a(p'q, p'r, qx, xr) = p'x$

so that $t(q, r) \subset B(p, p')$. The neighborhood $S(a, \rho_a/2)$ has, therefore, linear
bisectors (compare B. p. 262 condition (*)), so that the theorem follows from
the First Characterization of the spaces with constant curvature in B. p. 268.

If in any metric space four points p, q, x, r with $qx + \epsilon\, xr = qr > 0$ are
congruent to four points of a hyperbolic space of curvature $-\beta^2$, then the
relation (2) holds. A similar remark applies to the euclidean case and to the
spherical case if $\beta qr < \pi$. Hence we find


COROLLARY 1. If R satisfies conditions I to IV and if every point a of
R has a neighborhood $S(a, \rho_a)$ such that any four points p, q, r, x in $S(a, \rho_a)$
with (qxr) are congruent to a quadruple of points in a space R(a) which is
euclidean, hyperbolic or spherical, then R has constant curvature.


It is of interest to study the consequences of V if required in the large.


COROLLARY 2. If R satisfies I to IV and if a function $\phi(\xi_1, \xi_2, \xi_3, \xi_4)$
exists such that

$px = \phi(pq, pr, xq, \epsilon\, xr)$

when $qx + \epsilon\, xr = qr > 0$, then R is euclidean or hyperbolic.


The Theorem shows that R has constant curvature, whereas (6) implies
that the segment t(a, b) is unique for any a, b. Any geodesic in R is contained
in a two-dimensional surface of constant curvature. The elliptic plane and
the sphere are the only surfaces of constant positive curvature;⁴ hence every
space of constant positive curvature contains points a, b for which t(a, b)
is not unique.

The only surfaces of non-positive constant curvature on which shortest 
connections are unique are the euclidean and hyperbolic planes. This proves 
Corollary 2. 

The statement in the large corresponding to Corollary 1 is this: 


COROLLARY 3. If I to IV hold and any quadruple of points in R is
congruent to a quadruple of points in a euclidean space or a hyperbolic space of
curvature $-\beta^2$ or a spherical space of curvature $\beta^2$, then R is a euclidean,
hyperbolic or spherical space.


The formulas (1) and (2) hold in the first two cases, hence the assumptions
of Corollary 2 are satisfied.

In the spherical case (3) holds only for $\beta qr < \pi$. But since (3) holds in
the small the Theorem shows that R has constant positive curvature. Again,⁴
the only two-dimensional totally geodesic subspaces of R are spheres and
elliptic planes of curvature $\beta^2$. An elliptic plane contains four points
$a_1, a_2, a_3, a_4$ on a geodesic such that $a_1a_2 = a_2a_3 = a_3a_4 = a_4a_1 = \pi/4\beta$,
and $a_1a_3 = a_2a_4 = \pi/2\beta$. This quadruple is not congruent to a quadruple
on a sphere of radius $1/\beta$.

Condition II in Corollary 3 is stronger than completeness and separability 
assumed by Wilson and Blumenthal, consequently their results include infinite 
dimensional spaces. On the other hand in the euclidean and hyperbolic cases, 
these authors require external convexity, that is IV in the large. (Compare 
Blumenthal [1], p. 69 Theorem 6. 4). 


⁴ See Cartan [1], p. 174.


Spherical spaces are not externally convex. Therefore Blumenthal [1,
p. 74, Theorem 8.3] replaces external convexity by diametrization, which
means that for every point p a point p' with $pp' = \pi/\beta$ exists. The present
condition IV has the advantage of applying to all three cases.


SMITH COLLEGE. 
NORTHAMPTON, MASS. 


REFERENCES. 


Blumenthal, L. M. [1], Distance Geometries, University of Missouri Studies XIII
(1938), No. 2.

Busemann, H. [1], "Local metric geometry," Transactions of the American Mathematical
Society, vol. 56 (1944), pp. 200-274.

Cartan, E. [1], Leçons sur la géométrie des espaces de Riemann, Paris, 1928.

Vaughan, H. E. [1], "On locally compact metrizable spaces," Bulletin of the American
Mathematical Society, vol. 43 (1937), pp. 532-535.




